diff --git a/docs/source/user-guide/concepts.md b/docs/source/user-guide/concepts.md index 3c34535a..7214f5f7 100644 --- a/docs/source/user-guide/concepts.md +++ b/docs/source/user-guide/concepts.md @@ -8,9 +8,9 @@ Key terminology: - `Stage`: a portion of the plan separated by a network boundary from other parts of the plan. A plan contains one or more stages, each separated by network boundaries. -- `Task`: a unit of work in a stage that executes the inner plan in parallel to other tasks within the stage. Each task - in a stage executes a structurally identical plan in different worker, passing a `task_index` as a contextual value - for making choices about what data should be returned. +- `Task`: a unit of work in a stage that executes a plan in parallel to other tasks within the stage. Each task + in a stage runs on a different worker with its own plan variant — pre-specialized at planning time for the subset + of data it is responsible for. - `Network Boundary`: a node in the plan that streams data from a network interface rather than directly from its children nodes. - `Worker`: a physical machine listening to serialized execution plans over an Arrow Flight interface. A task is @@ -60,11 +60,14 @@ previous stages. An extension present during the `ExecutionPlan::execute()` that contains information about the current task in which the plan is being executed. -As a user, you will need to interact with this type in your custom leaf nodes, as depending on which task index -you are in, you might want to return a different set of data. +For built-in file-based plans (`DataSourceExec`), data partitioning is handled automatically at planning time via +`DistributedLeafExec`: each task receives a pre-built plan variant with its own isolated file groups, so no +runtime dispatch is needed. -For example, if you are on the task with index 0 of a 3-task stage, you might want to return only the first 1/3 of the -data. If you are on the task with index 2, you might want to return the last 1/3 of the data, and so on. +For custom leaf nodes that need to dispatch work themselves, `DistributedTaskContext` exposes `task_index` and +`task_count` so execution logic can select the appropriate data subset. For example, task 0 of 3 might return +the first third of rows, task 2 the last third, and so on. See the `TaskEstimator` documentation for guidance on +which approach to use. ## [ChannelResolver](https://github.com/datafusion-contrib/datafusion-distributed/blob/main/src/networking/channel_resolver.rs) diff --git a/docs/source/user-guide/task-estimator.md b/docs/source/user-guide/task-estimator.md index b2d42f4d..418f37e9 100644 --- a/docs/source/user-guide/task-estimator.md +++ b/docs/source/user-guide/task-estimator.md @@ -14,16 +14,20 @@ can provide custom `TaskEstimator` implementations for your own `ExecutionPlan` Providing a `TaskEstimator` allows you to do two things: -1. Tell the distributed planner how many tasks should be used for your own `ExecutionPlan`s. -2. Tell the distributed planner how to "scale up" your `ExecutionPlan` in order to account for it running in - multiple distributed tasks. - -If your custom nodes will execute in a distributed manner, you must handle this during execution. When your -`TaskEstimator` specifies N tasks for a node, your execution logic must respond to the -[DistributedTaskContext](https://github.com/datafusion-contrib/datafusion-distributed/blob/75b4e73e9052c6596b9d1744ce2bdfa6cbc010d3/src/stage.rs#L137-L137) -present in DataFusion's `TaskContext` to determine which subset of data this task should process. - -There's an example of how to do that in the `examples/` folder: +1. Tell the distributed planner how many tasks should be used for your own `ExecutionPlan`s + (`task_estimation`). +2. Prepare the leaf node at planning time for distributed execution (`scale_up_leaf_node`). + The recommended approach is to return a `DistributedLeafExec` wrapping your original plan + along with `N` per-task variants (one per task). `DistributedLeafExec` is transparent to + network boundaries—it exposes the same partition count as the original—and the task spawner + automatically replaces it with the appropriate per-task variant before serialising the plan + and sending it to a worker. + + If your leaf node already handles task dispatch internally (e.g., by reading + `DistributedTaskContext.task_index` in `execute()`), you can omit `DistributedLeafExec` and + simply return the prepared plan directly from `scale_up_leaf_node`. + +There's a complete example in the `examples/` folder: - [custom_execution_plan.rs](https://github.com/datafusion-contrib/datafusion-distributed/blob/main/examples/custom_execution_plan.rs) - A complete example showing how to implement a custom execution plan (`numbers(start, end)` table function) diff --git a/examples/custom_execution_plan.rs b/examples/custom_execution_plan.rs index f9a551ba..07d7ff91 100644 --- a/examples/custom_execution_plan.rs +++ b/examples/custom_execution_plan.rs @@ -127,7 +127,8 @@ impl TableProvider for NumbersTableProvider { } /// Custom execution plan that generates numbers from start to end. -/// When distributed, each task generates a subset of numbers based on its task_index. +/// When distributed, `scale_up_leaf_node` populates `ranges_per_task` with one entry per task +/// and `execute()` uses `DistributedTaskContext.task_index` to select this task's range. #[derive(Debug, Clone)] struct NumbersExec { ranges_per_task: Vec>, diff --git a/src/coordinator/prepare_static_plan.rs b/src/coordinator/prepare_static_plan.rs index ff072694..3da4c56a 100644 --- a/src/coordinator/prepare_static_plan.rs +++ b/src/coordinator/prepare_static_plan.rs @@ -3,10 +3,9 @@ use crate::coordinator::distributed::PreparedPlan; use crate::coordinator::task_spawner::{ CoordinatorToWorkerMetrics, CoordinatorToWorkerTaskSpawner, }; -use crate::distributed_planner::get_distributed_task_estimator; use crate::stage::RemoteStage; use crate::{ - DistributedCodec, NetworkBoundaryExt, Stage, TaskRoutingContext, + DistributedConfig, NetworkBoundaryExt, Stage, TaskEstimator, TaskRoutingContext, get_distributed_worker_resolver, }; use datafusion::common::runtime::JoinSet; @@ -35,7 +34,6 @@ pub(super) fn prepare_static_plan( ctx: &Arc, ) -> Result { let worker_resolver = get_distributed_worker_resolver(ctx.session_config())?; - let codec = DistributedCodec::new_combined_with_user(ctx.session_config()); let available_urls = worker_resolver.get_urls()?; @@ -52,15 +50,11 @@ pub(super) fn prepare_static_plan( return exec_err!("Input stage from network boundary was not in Local state"); }; - let task_estimator = get_distributed_task_estimator(ctx.session_config())?; + let d_cfg = DistributedConfig::from_config_options(ctx.session_config().options())?; + let task_estimator = &d_cfg.__private_task_estimator; - let mut spawner = CoordinatorToWorkerTaskSpawner::new( - stage, - &metrics, - task_metrics, - &codec, - &mut join_set, - )?; + let mut spawner = + CoordinatorToWorkerTaskSpawner::new(stage, &metrics, task_metrics, ctx, &mut join_set)?; let routed_urls = match task_estimator.route_tasks(&TaskRoutingContext { task_ctx: Arc::clone(ctx), diff --git a/src/coordinator/task_spawner.rs b/src/coordinator/task_spawner.rs index c959a75b..575ad3c8 100644 --- a/src/coordinator/task_spawner.rs +++ b/src/coordinator/task_spawner.rs @@ -1,6 +1,7 @@ use crate::common::{TreeNodeExt, serialize_uuid, task_ctx_with_extension}; use crate::config_extension_ext::get_config_extension_propagation_headers; use crate::coordinator::MetricsStore; +use crate::execution_plans::{ChildrenIsolatorUnionExec, DistributedLeafExec}; use crate::passthrough_headers::get_passthrough_headers; use crate::protobuf::tonic_status_to_datafusion_error; use crate::stage::LocalStage; @@ -8,20 +9,21 @@ use crate::worker::generated::worker as pb; use crate::worker::generated::worker::coordinator_to_worker_msg::Inner; use crate::worker::generated::worker::set_plan_request::WorkUnitFeedDeclaration; use crate::{ - DISTRIBUTED_DATAFUSION_TASK_ID_LABEL, DistributedConfig, DistributedTaskContext, - DistributedWorkUnitFeedContext, TaskKey, get_distributed_channel_resolver, + DISTRIBUTED_DATAFUSION_TASK_ID_LABEL, DistributedCodec, DistributedConfig, + DistributedTaskContext, DistributedWorkUnitFeedContext, TaskKey, + get_distributed_channel_resolver, }; use datafusion::common::Result; use datafusion::common::instant::Instant; use datafusion::common::runtime::JoinSet; -use datafusion::common::tree_node::TreeNodeRecursion; +use datafusion::common::tree_node::{Transformed, TreeNodeRecursion}; use datafusion::common::{DataFusionError, exec_datafusion_err}; use datafusion::execution::TaskContext; use datafusion::physical_expr_common::metrics::{ Count, ExecutionPlanMetricsSet, Label, MetricBuilder, MetricValue, Time, }; use datafusion::physical_plan::ExecutionPlan; -use datafusion_proto::physical_plan::{AsExecutionPlan, PhysicalExtensionCodec}; +use datafusion_proto::physical_plan::AsExecutionPlan; use datafusion_proto::protobuf::PhysicalPlanNode; use futures::StreamExt; use http::Extensions; @@ -72,10 +74,10 @@ impl CoordinatorToWorkerMetrics { /// remote counterparts. pub(super) struct CoordinatorToWorkerTaskSpawner<'a> { plan: &'a Arc, - plan_proto: Vec, query_id: Uuid, stage_id: usize, task_count: usize, + task_ctx: &'a TaskContext, metrics: &'a CoordinatorToWorkerMetrics, task_metrics: &'a Option>, join_set: &'a mut JoinSet>, @@ -88,18 +90,15 @@ impl<'a> CoordinatorToWorkerTaskSpawner<'a> { stage: &'a LocalStage, metrics: &'a CoordinatorToWorkerMetrics, task_metrics: &'a Option>, - codec: &'a dyn PhysicalExtensionCodec, + task_ctx: &'a TaskContext, join_set: &'a mut JoinSet>, ) -> Result { - let plan_proto = PhysicalPlanNode::try_from_physical_plan(Arc::clone(&stage.plan), codec)? - .encode_to_vec(); - Ok(Self { plan: &stage.plan, - plan_proto, query_id: stage.query_id, stage_id: stage.num, task_count: stage.tasks, + task_ctx, metrics, task_metrics, join_set, @@ -126,17 +125,35 @@ impl<'a> CoordinatorToWorkerTaskSpawner<'a> { task_index: task_i, task_count: self.task_count, }; - self.plan.apply_with_dt_ctx(d_ctx, |plan, _| { - let Some(wuf) = wuf_registry.get_work_unit_feed(plan) else { - return Ok(TreeNodeRecursion::Continue); + + let plan = Arc::clone(self.plan); + let specialized = plan.transform_down_with_dt_ctx(d_ctx, |plan, d_ctx| { + if let Some(wuf) = wuf_registry.get_work_unit_feed(&plan) { + work_unit_feed_declarations.push(WorkUnitFeedDeclaration { + id: serialize_uuid(&wuf.id()), + partitions: plan.properties().partitioning.partition_count() as u64, + }); }; - work_unit_feed_declarations.push(WorkUnitFeedDeclaration { - id: serialize_uuid(&wuf.id()), - partitions: plan.properties().partitioning.partition_count() as u64, - }); - Ok(TreeNodeRecursion::Continue) + + if let Some(ciu) = plan.as_any().downcast_ref::() { + let ciu = ciu.to_task_specialized(d_ctx.task_index); + return Ok(Transformed::yes(Arc::new(ciu))); + }; + + if let Some(dle) = plan.as_any().downcast_ref::() { + let specialized = dle.to_task_specialized(d_ctx.task_index); + return Ok(Transformed::yes(specialized)); + } + + Ok(Transformed::no(plan)) })?; + let codec = DistributedCodec::new_combined_with_user(self.task_ctx.session_config()); + + let plan_proto = + PhysicalPlanNode::try_from_physical_plan(specialized.data, &codec)?.encode_to_vec(); + let plan_size = plan_proto.len(); + let task_key = TaskKey { query_id: serialize_uuid(&self.query_id), stage_id: self.stage_id as u64, @@ -144,14 +161,13 @@ impl<'a> CoordinatorToWorkerTaskSpawner<'a> { }; let msg = pb::CoordinatorToWorkerMsg { inner: Some(Inner::SetPlanRequest(pb::SetPlanRequest { - plan_proto: self.plan_proto.clone(), + plan_proto, task_count: self.task_count as u64, task_key: Some(task_key.clone()), work_unit_feed_declarations, target_worker_url: url.to_string(), })), }; - let plan_size = self.plan_proto.len(); let (coordinator_to_worker_tx, coordinator_to_worker_rx) = tokio::sync::mpsc::unbounded_channel(); diff --git a/src/distributed_planner/distributed_query_planner.rs b/src/distributed_planner/distributed_query_planner.rs index 5b699d1a..9ffb04bb 100644 --- a/src/distributed_planner/distributed_query_planner.rs +++ b/src/distributed_planner/distributed_query_planner.rs @@ -179,10 +179,9 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=4, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p7] t1:[p0..p7] t2:[p0..p7] - │ RepartitionExec: partitioning=Hash([RainToday@0], 8), input_partitions=1 + │ RepartitionExec: partitioning=Hash([RainToday@0], 8), input_partitions=3 │ AggregateExec: mode=Partial, gby=[RainToday@0 as RainToday], aggr=[count(Int64(1))] - │ PartitionIsolatorExec: tasks=3 partitions=3 - │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[RainToday], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[RainToday], file_type=parquet └────────────────────────────────────────────────── "); } @@ -209,10 +208,9 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=4, input_tasks=2 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p7] t1:[p0..p7] - │ RepartitionExec: partitioning=Hash([RainToday@0], 8), input_partitions=2 + │ RepartitionExec: partitioning=Hash([RainToday@0], 8), input_partitions=3 │ AggregateExec: mode=Partial, gby=[RainToday@0 as RainToday], aggr=[count(Int64(1))] - │ PartitionIsolatorExec: tasks=2 partitions=3 - │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[RainToday], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[RainToday], file_type=parquet └────────────────────────────────────────────────── "); } @@ -259,10 +257,9 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=4, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p3] t1:[p0..p3] t2:[p0..p3] - │ RepartitionExec: partitioning=Hash([RainToday@0], 4), input_partitions=1 + │ RepartitionExec: partitioning=Hash([RainToday@0], 4), input_partitions=3 │ AggregateExec: mode=Partial, gby=[RainToday@0 as RainToday], aggr=[count(Int64(1))] - │ PartitionIsolatorExec: tasks=3 partitions=3 - │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[RainToday], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[RainToday], file_type=parquet └────────────────────────────────────────────────── "); } @@ -312,10 +309,9 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=4, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p7] t1:[p0..p7] t2:[p0..p7] - │ RepartitionExec: partitioning=Hash([RainToday@0], 8), input_partitions=1 + │ RepartitionExec: partitioning=Hash([RainToday@0], 8), input_partitions=3 │ AggregateExec: mode=Partial, gby=[RainToday@0 as RainToday], aggr=[count(Int64(1))] - │ PartitionIsolatorExec: tasks=3 partitions=3 - │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[RainToday], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[RainToday], file_type=parquet └────────────────────────────────────────────────── "); } @@ -385,17 +381,15 @@ mod tests { │ RepartitionExec: partitioning=Hash([RainTomorrow@0], 8), input_partitions=4 │ AggregateExec: mode=Partial, gby=[RainTomorrow@1 as RainTomorrow], aggr=[avg(weather.MinTemp)] │ FilterExec: RainToday@1 = yes, projection=[MinTemp@0, RainTomorrow@2] - │ RepartitionExec: partitioning=RoundRobinBatch(4), input_partitions=1 - │ PartitionIsolatorExec: tasks=3 partitions=3 - │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MinTemp, RainToday, RainTomorrow], file_type=parquet, predicate=RainToday@19 = yes, pruning_predicate=RainToday_null_count@2 != row_count@3 AND RainToday_min@0 <= yes AND yes <= RainToday_max@1, required_guarantees=[RainToday in (yes)] + │ RepartitionExec: partitioning=RoundRobinBatch(4), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MinTemp, RainToday, RainTomorrow], file_type=parquet, predicate=RainToday@19 = yes, pruning_predicate=RainToday_null_count@2 != row_count@3 AND RainToday_min@0 <= yes AND yes <= RainToday_max@1, required_guarantees=[RainToday in (yes)] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p3] t1:[p0..p3] t2:[p0..p3] │ RepartitionExec: partitioning=Hash([RainTomorrow@0], 4), input_partitions=4 │ AggregateExec: mode=Partial, gby=[RainTomorrow@1 as RainTomorrow], aggr=[avg(weather.MaxTemp)] │ FilterExec: RainToday@1 = no, projection=[MaxTemp@0, RainTomorrow@2] - │ RepartitionExec: partitioning=RoundRobinBatch(4), input_partitions=1 - │ PartitionIsolatorExec: tasks=3 partitions=3 - │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MaxTemp, RainToday, RainTomorrow], file_type=parquet, predicate=RainToday@19 = no, pruning_predicate=RainToday_null_count@2 != row_count@3 AND RainToday_min@0 <= no AND no <= RainToday_max@1, required_guarantees=[RainToday in (no)] + │ RepartitionExec: partitioning=RoundRobinBatch(4), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MaxTemp, RainToday, RainTomorrow], file_type=parquet, predicate=RainToday@19 = no, pruning_predicate=RainToday_null_count@2 != row_count@3 AND RainToday_min@0 <= no AND no <= RainToday_max@1, required_guarantees=[RainToday in (no)] └────────────────────────────────────────────────── "); } @@ -412,12 +406,11 @@ mod tests { assert_snapshot!(plan, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [MinTemp@0 DESC] - │ [Stage 1] => NetworkCoalesceExec: output_partitions=3, input_tasks=3 + │ [Stage 1] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0] t1:[p1] t2:[p2] + ┌───── Stage 1 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ SortExec: expr=[MinTemp@0 DESC], preserve_partitioning=[true] - │ PartitionIsolatorExec: tasks=3 partitions=3 - │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MinTemp, MaxTemp, Rainfall, Evaporation, Sunshine, WindGustDir, WindGustSpeed, WindDir9am, WindDir3pm, WindSpeed9am, WindSpeed3pm, Humidity9am, Humidity3pm, Pressure9am, Pressure3pm, Cloud9am, Cloud3pm, Temp9am, Temp3pm, RainToday, RISK_MM, RainTomorrow], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MinTemp, MaxTemp, Rainfall, Evaporation, Sunshine, WindGustDir, WindGustSpeed, WindDir9am, WindDir3pm, WindSpeed9am, WindSpeed3pm, Humidity9am, Humidity3pm, Pressure9am, Pressure3pm, Cloud9am, Cloud3pm, Temp9am, Temp3pm, RainToday, RISK_MM, RainTomorrow], file_type=parquet └────────────────────────────────────────────────── "); } @@ -443,10 +436,9 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=4, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p7] t1:[p0..p7] t2:[p0..p7] - │ RepartitionExec: partitioning=Hash([RainToday@0], 8), input_partitions=1 + │ RepartitionExec: partitioning=Hash([RainToday@0], 8), input_partitions=3 │ AggregateExec: mode=Partial, gby=[RainToday@0 as RainToday], aggr=[count(Int64(1))] - │ PartitionIsolatorExec: tasks=3 partitions=3 - │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[RainToday], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[RainToday], file_type=parquet └────────────────────────────────────────────────── "); } @@ -470,10 +462,9 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=4, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p7] t1:[p0..p7] t2:[p0..p7] - │ RepartitionExec: partitioning=Hash([RainToday@0, WindGustDir@1], 8), input_partitions=1 + │ RepartitionExec: partitioning=Hash([RainToday@0, WindGustDir@1], 8), input_partitions=3 │ AggregateExec: mode=Partial, gby=[RainToday@0 as RainToday, WindGustDir@1 as WindGustDir], aggr=[] - │ PartitionIsolatorExec: tasks=3 partitions=3 - │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[RainToday, WindGustDir], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[RainToday, WindGustDir], file_type=parquet └────────────────────────────────────────────────── "); } @@ -565,14 +556,12 @@ mod tests { ┌───── Stage 1 ── Tasks: t0:[p0..p3] t1:[p4..p7] t2:[p8..p11] t3:[p12..p15] t4:[p16..p19] t5:[p20..p23] │ DistributedUnionExec: t0:[c0(0/3)] t1:[c0(1/3)] t2:[c0(2/3)] t3:[c1(0/3)] t4:[c1(1/3)] t5:[c1(2/3)] │ FilterExec: MinTemp@0 > 10 - │ RepartitionExec: partitioning=RoundRobinBatch(4), input_partitions=1 - │ PartitionIsolatorExec: tasks=3 partitions=3 - │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MinTemp, RainToday], file_type=parquet, predicate=MinTemp@0 > 10, pruning_predicate=MinTemp_null_count@1 != row_count@2 AND MinTemp_max@0 > 10, required_guarantees=[] + │ RepartitionExec: partitioning=RoundRobinBatch(4), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MinTemp, RainToday], file_type=parquet, predicate=MinTemp@0 > 10, pruning_predicate=MinTemp_null_count@1 != row_count@2 AND MinTemp_max@0 > 10, required_guarantees=[] │ ProjectionExec: expr=[MaxTemp@0 as MinTemp, RainToday@1 as RainToday] │ FilterExec: MaxTemp@0 < 30 - │ RepartitionExec: partitioning=RoundRobinBatch(4), input_partitions=1 - │ PartitionIsolatorExec: tasks=3 partitions=3 - │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MaxTemp, RainToday], file_type=parquet, predicate=MaxTemp@1 < 30, pruning_predicate=MaxTemp_null_count@1 != row_count@2 AND MaxTemp_min@0 < 30, required_guarantees=[] + │ RepartitionExec: partitioning=RoundRobinBatch(4), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MaxTemp, RainToday], file_type=parquet, predicate=MaxTemp@1 < 30, pruning_predicate=MaxTemp_null_count@1 != row_count@2 AND MaxTemp_min@0 < 30, required_guarantees=[] └────────────────────────────────────────────────── "); } @@ -597,9 +586,8 @@ mod tests { ┌───── Stage 1 ── Tasks: t0:[p0..p3] t1:[p4..p7] t2:[p8..p11] │ DistributedUnionExec: t0:[c0(0/2)] t1:[c0(1/2)] t2:[c1] │ FilterExec: MinTemp@0 > 10 - │ RepartitionExec: partitioning=RoundRobinBatch(4), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=3 - │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MinTemp, RainToday], file_type=parquet, predicate=MinTemp@0 > 10, pruning_predicate=MinTemp_null_count@1 != row_count@2 AND MinTemp_max@0 > 10, required_guarantees=[] + │ RepartitionExec: partitioning=RoundRobinBatch(4), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MinTemp, RainToday], file_type=parquet, predicate=MinTemp@0 > 10, pruning_predicate=MinTemp_null_count@1 != row_count@2 AND MinTemp_max@0 > 10, required_guarantees=[] │ ProjectionExec: expr=[MaxTemp@0 as MinTemp, RainToday@1 as RainToday] │ FilterExec: MaxTemp@0 < 30 │ RepartitionExec: partitioning=RoundRobinBatch(4), input_partitions=3 @@ -703,19 +691,17 @@ mod tests { assert_snapshot!(annotated, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ CoalescePartitionsExec - │ [Stage 2] => NetworkCoalesceExec: output_partitions=3, input_tasks=3 + │ [Stage 2] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0] t1:[p1] t2:[p2] + ┌───── Stage 2 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(RainToday@1, RainToday@1)], projection=[MinTemp@0, MaxTemp@2] │ CoalescePartitionsExec - │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=1, stage_partitions=3, input_tasks=3 - │ PartitionIsolatorExec: tasks=3 partitions=3 - │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MaxTemp, RainToday], file_type=parquet, predicate=DynamicFilter [ empty ] + │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MaxTemp, RainToday], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] - │ BroadcastExec: input_partitions=1, consumer_tasks=3, output_partitions=3 - │ PartitionIsolatorExec: tasks=3 partitions=3 - │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MinTemp, RainToday], file_type=parquet + ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MinTemp, RainToday], file_type=parquet └────────────────────────────────────────────────── ") } @@ -732,27 +718,24 @@ mod tests { assert_snapshot!(plan, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ CoalescePartitionsExec - │ [Stage 3] => NetworkCoalesceExec: output_partitions=3, input_tasks=3 + │ [Stage 3] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 └────────────────────────────────────────────────── - ┌───── Stage 3 ── Tasks: t0:[p0] t1:[p1] t2:[p2] + ┌───── Stage 3 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(RainToday@2, RainToday@1)], projection=[MinTemp@0, MaxTemp@1, Rainfall@3] │ CoalescePartitionsExec - │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=1, stage_partitions=3, input_tasks=3 - │ PartitionIsolatorExec: tasks=3 partitions=3 - │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[Rainfall, RainToday], file_type=parquet, predicate=DynamicFilter [ empty ] + │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[Rainfall, RainToday], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] - │ BroadcastExec: input_partitions=1, consumer_tasks=3, output_partitions=3 + ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(RainToday@1, RainToday@1)], projection=[MinTemp@0, MaxTemp@2, RainToday@3] │ CoalescePartitionsExec - │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=1, stage_partitions=3, input_tasks=3 - │ PartitionIsolatorExec: tasks=3 partitions=3 - │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MaxTemp, RainToday], file_type=parquet, predicate=DynamicFilter [ empty ] + │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MaxTemp, RainToday], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] - │ BroadcastExec: input_partitions=1, consumer_tasks=3, output_partitions=3 - │ PartitionIsolatorExec: tasks=3 partitions=3 - │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MinTemp, RainToday], file_type=parquet + ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MinTemp, RainToday], file_type=parquet └────────────────────────────────────────────────── ") } @@ -777,19 +760,17 @@ mod tests { assert_snapshot!(plan, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ CoalescePartitionsExec - │ [Stage 2] => NetworkCoalesceExec: output_partitions=3, input_tasks=3 + │ [Stage 2] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0] t1:[p1] t2:[p2] + ┌───── Stage 2 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(RainToday@1, RainToday@1)], projection=[MinTemp@0, MaxTemp@2] │ CoalescePartitionsExec - │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=1, stage_partitions=3, input_tasks=3 - │ PartitionIsolatorExec: tasks=3 partitions=3 - │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MaxTemp, RainToday], file_type=parquet, predicate=DynamicFilter [ empty ] + │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MaxTemp, RainToday], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] - │ BroadcastExec: input_partitions=1, consumer_tasks=3, output_partitions=3 - │ PartitionIsolatorExec: tasks=3 partitions=3 - │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MinTemp, RainToday], file_type=parquet + ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MinTemp, RainToday], file_type=parquet └────────────────────────────────────────────────── ") } @@ -820,31 +801,28 @@ mod tests { │ DistributedUnionExec: t0:[c0] t1:[c1] t2:[c2] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(RainToday@1, RainToday@1)], projection=[MinTemp@0, MaxTemp@2] │ CoalescePartitionsExec - │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=1, stage_partitions=1, input_tasks=3 + │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=3, input_tasks=3 │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MaxTemp, RainToday], file_type=parquet, predicate=DynamicFilter [ empty ] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(RainToday@1, RainToday@1)], projection=[MinTemp@0, MaxTemp@2] │ CoalescePartitionsExec - │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=1, stage_partitions=1, input_tasks=3 + │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=3, input_tasks=3 │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MaxTemp, RainToday], file_type=parquet, predicate=DynamicFilter [ empty ] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(RainToday@1, RainToday@1)], projection=[MinTemp@0, MaxTemp@2] │ CoalescePartitionsExec - │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=1, stage_partitions=1, input_tasks=3 + │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=3, input_tasks=3 │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MaxTemp, RainToday], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0] t1:[p1] t2:[p2] - │ BroadcastExec: input_partitions=1, consumer_tasks=1, output_partitions=1 - │ PartitionIsolatorExec: tasks=3 partitions=3 - │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MinTemp, RainToday], file_type=parquet + ┌───── Stage 1 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] + │ BroadcastExec: input_partitions=3, consumer_tasks=1, output_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MinTemp, RainToday], file_type=parquet └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0] t1:[p1] t2:[p2] - │ BroadcastExec: input_partitions=1, consumer_tasks=1, output_partitions=1 - │ PartitionIsolatorExec: tasks=3 partitions=3 - │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MinTemp, RainToday], file_type=parquet + ┌───── Stage 2 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] + │ BroadcastExec: input_partitions=3, consumer_tasks=1, output_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MinTemp, RainToday], file_type=parquet └────────────────────────────────────────────────── - ┌───── Stage 3 ── Tasks: t0:[p0] t1:[p1] t2:[p2] - │ BroadcastExec: input_partitions=1, consumer_tasks=1, output_partitions=1 - │ PartitionIsolatorExec: tasks=3 partitions=3 - │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MinTemp, RainToday], file_type=parquet + ┌───── Stage 3 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] + │ BroadcastExec: input_partitions=3, consumer_tasks=1, output_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MinTemp, RainToday], file_type=parquet └────────────────────────────────────────────────── "); } @@ -860,14 +838,13 @@ mod tests { assert_snapshot!(plan, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ CoalescePartitionsExec - │ [Stage 2] => NetworkCoalesceExec: output_partitions=3, input_tasks=3 + │ [Stage 2] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0] t1:[p1] t2:[p2] + ┌───── Stage 2 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(RainToday@1, RainToday@1)], projection=[MinTemp@0, MaxTemp@2] │ CoalescePartitionsExec │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=1 - │ PartitionIsolatorExec: tasks=3 partitions=3 - │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MaxTemp, RainToday], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MaxTemp, RainToday], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 diff --git a/src/distributed_planner/inject_network_boundaries.rs b/src/distributed_planner/inject_network_boundaries.rs index f81c0c57..2804c22b 100644 --- a/src/distributed_planner/inject_network_boundaries.rs +++ b/src/distributed_planner/inject_network_boundaries.rs @@ -75,8 +75,11 @@ use uuid::Uuid; /// just delimited stage. /// - The `T` task count reconciled from phase 1 is assigned to every node in the stage during /// this top-down walk. -/// - Leaves go through [TaskEstimator::scale_up_leaf_node] (e.g. a `DataSourceExec` may be wrapped -/// in a `PartitionIsolatorExec`) which is called using `T` as the `task_count` argument. +/// - Leaves go through [TaskEstimator::scale_up_leaf_node] which is called using `T` as the +/// `task_count` argument. The default file-scan estimator wraps the leaf in a +/// [DistributedLeafExec] that holds one per-task variant for each of the `T` tasks; the +/// wrapper is transparent to network boundaries (it reports the same partition count as the +/// original) and is replaced by its per-task variant in the task spawner before serialisation. /// - If the walk meets a network boundary that was already injected by an earlier iteration of this /// loop, it does **not** descend into it — that subtree belongs to a previously-formed stage and /// has already been finalised. @@ -94,14 +97,9 @@ use uuid::Uuid; /// │ /// ▼ /// ┌─────────────────────────────┐ -/// │ DataSourceExec │ ← scale-up via TaskEstimator -/// └─────────────────────────────┘ e.g.: -/// ┌────────────────────┐ -/// │ PartitionIsolator │ ← T -/// │ ┌─────────────┐ │ -/// │ │ DataSource │ │ ← T -/// │ └─────────────┘ │ -/// └────────────────────┘ +/// │ DistributedLeafExec │ ← wraps DataSourceExec; this is replaced +/// │ (DataSourceExec) │ with the per-task variant before sending +/// └─────────────────────────────┘ to workers /// ``` /// /// ## Phase 3 — inject the boundary and seed the next stage's starting task count @@ -379,9 +377,9 @@ async fn _inject_network_boundaries( /// /// Per-case behaviour: /// -/// - **Leaves**: ask the [TaskEstimator] for an optional scaled-up replacement (e.g. wrapping a -/// `DataSourceExec` in a `PartitionIsolatorExec`). Every node in the returned subtree — -/// including any wrappers the estimator introduced — is recorded with `task_count`. +/// - **Leaves**: ask the [TaskEstimator] for an optional scaled-up replacement (e.g. expanding a +/// `DataSourceExec`'s file groups by `task_count`). Every node in the returned subtree is +/// recorded with `task_count`. /// - **Network boundaries**: don't descend into the boundary's input plan (it lives in another /// stage). Instead, rescale the boundary's input via [network_boundary_scale_input] using the /// *consumer* partition and task counts of this side of the boundary, and stitch the rescaled @@ -407,8 +405,7 @@ fn propagate_task_count_until_network_boundaries( match scaled_up { None => Ok(ctx.plan_with_task_count(Arc::clone(plan), task_count)), Some(scaled_up) => { - // The scaled up node might contain more than 1 node, for example, if a - // PartitionIsolatorExec was injected. + // The scaled up subtree may contain more than 1 node. scaled_up.apply(|plan| { ctx.set_task_count(plan, task_count); Ok(TreeNodeRecursion::Continue) @@ -594,8 +591,7 @@ mod tests { NetworkShuffleExec: task_count=Desired(2) RepartitionExec: task_count=Desired(3) AggregateExec: task_count=Desired(3) - PartitionIsolatorExec: task_count=Desired(3) - DataSourceExec: task_count=Desired(3) + DataSourceExec: task_count=Desired(3) ") } @@ -650,8 +646,7 @@ mod tests { AggregateExec: task_count=Desired(3) FilterExec: task_count=Desired(3) RepartitionExec: task_count=Desired(3) - PartitionIsolatorExec: task_count=Desired(3) - DataSourceExec: task_count=Desired(3) + DataSourceExec: task_count=Desired(3) ProjectionExec: task_count=Maximum(1) AggregateExec: task_count=Maximum(1) NetworkShuffleExec: task_count=Maximum(1) @@ -659,8 +654,7 @@ mod tests { AggregateExec: task_count=Desired(3) FilterExec: task_count=Desired(3) RepartitionExec: task_count=Desired(3) - PartitionIsolatorExec: task_count=Desired(3) - DataSourceExec: task_count=Desired(3) + DataSourceExec: task_count=Desired(3) ") } @@ -691,8 +685,7 @@ mod tests { NetworkShuffleExec: task_count=Desired(2) RepartitionExec: task_count=Desired(3) AggregateExec: task_count=Desired(3) - PartitionIsolatorExec: task_count=Desired(3) - DataSourceExec: task_count=Desired(3) + DataSourceExec: task_count=Desired(3) ") } @@ -708,13 +701,11 @@ mod tests { ChildrenIsolatorUnionExec: task_count=Desired(4) FilterExec: task_count=Maximum(2) RepartitionExec: task_count=Maximum(2) - PartitionIsolatorExec: task_count=Maximum(2) - DataSourceExec: task_count=Maximum(2) + DataSourceExec: task_count=Maximum(2) ProjectionExec: task_count=Maximum(2) FilterExec: task_count=Maximum(2) RepartitionExec: task_count=Maximum(2) - PartitionIsolatorExec: task_count=Maximum(2) - DataSourceExec: task_count=Maximum(2) + DataSourceExec: task_count=Maximum(2) ") } @@ -729,8 +720,7 @@ mod tests { assert_snapshot!(annotated, @r" FilterExec: task_count=Desired(3) RepartitionExec: task_count=Desired(3) - PartitionIsolatorExec: task_count=Desired(3) - DataSourceExec: task_count=Desired(3) + DataSourceExec: task_count=Desired(3) ") } @@ -747,8 +737,7 @@ mod tests { SortExec: task_count=Desired(3) NetworkShuffleExec: task_count=Desired(3) RepartitionExec: task_count=Desired(3) - PartitionIsolatorExec: task_count=Desired(3) - DataSourceExec: task_count=Desired(3) + DataSourceExec: task_count=Desired(3) ") } @@ -768,8 +757,7 @@ mod tests { ChildrenIsolatorUnionExec: task_count=Desired(4) FilterExec: task_count=Maximum(2) RepartitionExec: task_count=Maximum(2) - PartitionIsolatorExec: task_count=Maximum(2) - DataSourceExec: task_count=Maximum(2) + DataSourceExec: task_count=Maximum(2) ProjectionExec: task_count=Maximum(1) FilterExec: task_count=Maximum(1) RepartitionExec: task_count=Maximum(1) @@ -835,10 +823,8 @@ mod tests { CoalescePartitionsExec: task_count=Desired(3) NetworkBroadcastExec: task_count=Desired(3) BroadcastExec: task_count=Desired(3) - PartitionIsolatorExec: task_count=Desired(3) - DataSourceExec: task_count=Desired(3) - PartitionIsolatorExec: task_count=Desired(3) - DataSourceExec: task_count=Desired(3) + DataSourceExec: task_count=Desired(3) + DataSourceExec: task_count=Desired(3) ") } @@ -867,10 +853,8 @@ mod tests { CoalescePartitionsExec: task_count=Desired(3) NetworkBroadcastExec: task_count=Desired(3) BroadcastExec: task_count=Desired(3) - PartitionIsolatorExec: task_count=Desired(3) - DataSourceExec: task_count=Desired(3) - PartitionIsolatorExec: task_count=Desired(3) - DataSourceExec: task_count=Desired(3) + DataSourceExec: task_count=Desired(3) + DataSourceExec: task_count=Desired(3) "); } @@ -889,8 +873,7 @@ mod tests { NetworkBroadcastExec: task_count=Desired(3) BroadcastExec: task_count=Maximum(1) DataSourceExec: task_count=Maximum(1) - PartitionIsolatorExec: task_count=Desired(3) - DataSourceExec: task_count=Desired(3) + DataSourceExec: task_count=Desired(3) "); } @@ -909,8 +892,7 @@ mod tests { CoalescePartitionsExec: task_count=Maximum(1) NetworkBroadcastExec: task_count=Maximum(1) BroadcastExec: task_count=Desired(3) - PartitionIsolatorExec: task_count=Desired(3) - DataSourceExec: task_count=Desired(3) + DataSourceExec: task_count=Desired(3) DataSourceExec: task_count=Maximum(1) "); } @@ -951,12 +933,9 @@ mod tests { CoalescePartitionsExec: task_count=Desired(3) NetworkBroadcastExec: task_count=Desired(3) BroadcastExec: task_count=Desired(3) - PartitionIsolatorExec: task_count=Desired(3) - DataSourceExec: task_count=Desired(3) - PartitionIsolatorExec: task_count=Desired(3) - DataSourceExec: task_count=Desired(3) - PartitionIsolatorExec: task_count=Desired(3) - DataSourceExec: task_count=Desired(3) + DataSourceExec: task_count=Desired(3) + DataSourceExec: task_count=Desired(3) + DataSourceExec: task_count=Desired(3) ") } @@ -985,23 +964,19 @@ mod tests { CoalescePartitionsExec: task_count=Maximum(2) NetworkBroadcastExec: task_count=Maximum(2) BroadcastExec: task_count=Desired(3) - PartitionIsolatorExec: task_count=Desired(3) - DataSourceExec: task_count=Desired(3) - PartitionIsolatorExec: task_count=Maximum(2) - DataSourceExec: task_count=Maximum(2) + DataSourceExec: task_count=Desired(3) + DataSourceExec: task_count=Maximum(2) HashJoinExec: task_count=Maximum(1) CoalescePartitionsExec: task_count=Maximum(1) NetworkBroadcastExec: task_count=Maximum(1) BroadcastExec: task_count=Desired(3) - PartitionIsolatorExec: task_count=Desired(3) - DataSourceExec: task_count=Desired(3) + DataSourceExec: task_count=Desired(3) DataSourceExec: task_count=Maximum(1) HashJoinExec: task_count=Maximum(1) CoalescePartitionsExec: task_count=Maximum(1) NetworkBroadcastExec: task_count=Maximum(1) BroadcastExec: task_count=Desired(3) - PartitionIsolatorExec: task_count=Desired(3) - DataSourceExec: task_count=Desired(3) + DataSourceExec: task_count=Desired(3) DataSourceExec: task_count=Maximum(1) "); } diff --git a/src/distributed_planner/mod.rs b/src/distributed_planner/mod.rs index 0cf8aa58..f0d2ad23 100644 --- a/src/distributed_planner/mod.rs +++ b/src/distributed_planner/mod.rs @@ -12,5 +12,5 @@ mod task_estimator; pub use distributed_config::DistributedConfig; pub use network_boundary::{NetworkBoundary, NetworkBoundaryExt}; pub use session_state_builder_ext::SessionStateBuilderExt; +pub(crate) use task_estimator::set_distributed_task_estimator; pub use task_estimator::{TaskCountAnnotation, TaskEstimation, TaskEstimator, TaskRoutingContext}; -pub(crate) use task_estimator::{get_distributed_task_estimator, set_distributed_task_estimator}; diff --git a/src/distributed_planner/partial_reduce_below_network_shuffles.rs b/src/distributed_planner/partial_reduce_below_network_shuffles.rs index 21b35870..614af4ee 100644 --- a/src/distributed_planner/partial_reduce_below_network_shuffles.rs +++ b/src/distributed_planner/partial_reduce_below_network_shuffles.rs @@ -115,7 +115,7 @@ mod tests { let explain = sql_to_explain(r#"SELECT "RainToday", COUNT(*) FROM weather GROUP BY "RainToday""#) .await; - assert_snapshot!(explain, @" + assert_snapshot!(explain, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ CoalescePartitionsExec │ [Stage 2] => NetworkCoalesceExec: output_partitions=8, input_tasks=2 @@ -127,10 +127,9 @@ mod tests { └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p7] t1:[p0..p7] t2:[p0..p7] │ AggregateExec: mode=PartialReduce, gby=[RainToday@0 as RainToday], aggr=[count(Int64(1))] - │ RepartitionExec: partitioning=Hash([RainToday@0], 8), input_partitions=1 + │ RepartitionExec: partitioning=Hash([RainToday@0], 8), input_partitions=3 │ AggregateExec: mode=Partial, gby=[RainToday@0 as RainToday], aggr=[count(Int64(1))] - │ PartitionIsolatorExec: tasks=3 partitions=3 - │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[RainToday], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[RainToday], file_type=parquet └────────────────────────────────────────────────── "); } diff --git a/src/distributed_planner/task_estimator.rs b/src/distributed_planner/task_estimator.rs index 0e0c3857..553b644a 100644 --- a/src/distributed_planner/task_estimator.rs +++ b/src/distributed_planner/task_estimator.rs @@ -1,15 +1,16 @@ +use crate::DistributedConfig; use crate::config_extension_ext::set_distributed_option_extension; -use crate::{DistributedConfig, PartitionIsolatorExec}; +use crate::execution_plans::DistributedLeafExec; use TaskCountAnnotation::*; use datafusion::catalog::memory::DataSourceExec; -use datafusion::common::exec_err; use datafusion::config::ConfigOptions; -use datafusion::datasource::physical_plan::FileScanConfig; -use datafusion::error::{DataFusionError, Result}; +use datafusion::datasource::physical_plan::{FileGroup, FileScanConfig}; +use datafusion::error::Result; use datafusion::execution::TaskContext; -use datafusion::physical_plan::ExecutionPlan; +use datafusion::physical_plan::{ExecutionPlan, ExecutionPlanProperties}; use datafusion::prelude::SessionConfig; use delegate::delegate; +use std::collections::VecDeque; use std::fmt::Debug; use std::sync::Arc; use url::Url; @@ -228,18 +229,6 @@ pub(crate) fn set_distributed_task_estimator( } } -pub(crate) fn get_distributed_task_estimator( - cfg: &SessionConfig, -) -> Result, DataFusionError> { - let opts = cfg.options(); - let Some(distributed_cfg) = opts.extensions.get::() else { - return exec_err!("TaskEstimator not present in the session config"); - }; - let task_estimator: Arc = - Arc::new(distributed_cfg.__private_task_estimator.clone()); - Ok(Arc::clone(&task_estimator)) -} - /// [TaskEstimator] implementation that acts on [DataSourceExec] nodes that contain /// [FileScanConfig]s data sources (e.g., Parquet or CSV files). it will read the /// [DistributedConfig].`files_per_task` field and assigns as many tasks as needed so that @@ -280,23 +269,33 @@ impl TaskEstimator for FileScanConfigTaskEstimator { _cfg: &ConfigOptions, ) -> Option> { if task_count == 1 { - return Some(Arc::clone(plan)); + return None; } - // Based on the task count, attempt to scale up the partitions in the DataSourceExec by - // repartitioning it. This will result in a DataSourceExec with potentially a lot of - // partitions, but as we are going to wrap it with PartitionIsolatorExec, that's fine. let dse: &DataSourceExec = plan.as_any().downcast_ref()?; let file_scan: &FileScanConfig = dse.data_source().as_any().downcast_ref()?; - let mut new_file_scan = file_scan.clone(); - new_file_scan.file_groups.clear(); - for file_group in file_scan.file_groups.clone() { - new_file_scan - .file_groups - .extend(file_group.split_files(task_count)); + let mut file_scan_template = file_scan.clone(); + let mut file_groups = VecDeque::with_capacity(file_scan.file_groups.len() * task_count); + for file_group in file_scan_template.file_groups.drain(..) { + file_groups.extend(file_group.split_files(task_count)); } - let plan = DataSourceExec::from_data_source(new_file_scan); - Some(Arc::new(PartitionIsolatorExec::new(plan, task_count))) + let expected_partitions = plan.output_partitioning().partition_count(); + + let dle = DistributedLeafExec::new( + Arc::clone(plan), + (0..task_count).map(|_| { + let mut new_file_scan = file_scan_template.clone(); + while new_file_scan.file_groups.len() < expected_partitions { + match file_groups.pop_front() { + None => new_file_scan.file_groups.push(FileGroup::new(vec![])), + Some(fg) => new_file_scan.file_groups.push(fg), + } + } + DataSourceExec::from_data_source(new_file_scan) as _ + }), + ); + + Some(Arc::new(dle)) } } diff --git a/src/execution_plans/children_isolator_union.rs b/src/execution_plans/children_isolator_union.rs index af675202..0bf51378 100644 --- a/src/execution_plans/children_isolator_union.rs +++ b/src/execution_plans/children_isolator_union.rs @@ -5,6 +5,7 @@ use datafusion::arrow::datatypes::SchemaRef; use datafusion::common::{internal_err, plan_err}; use datafusion::error::DataFusionError; use datafusion::execution::{RecordBatchStream, SendableRecordBatchStream, TaskContext}; +use datafusion::physical_plan::empty::EmptyExec; use datafusion::physical_plan::metrics::{BaselineMetrics, ExecutionPlanMetricsSet, MetricsSet}; use datafusion::physical_plan::union::UnionExec; use datafusion::physical_plan::{ @@ -194,6 +195,33 @@ impl ChildrenIsolatorUnionExec { } counts } + + /// Trims out all the children that are going to be ignored based on the provided + /// task index. These children are replaced by [EmptyExec] as placeholders. + pub(crate) fn to_task_specialized(&self, task_i: usize) -> Self { + let mut children_to_keep = vec![]; + for (child_i, _) in &self.task_idx_map[task_i] { + children_to_keep.push(*child_i); + } + let new_children = self + .children + .iter() + .enumerate() + .map( + |(child_i, plan)| match children_to_keep.contains(&child_i) { + true => Arc::clone(plan), + false => Arc::new(EmptyExec::new(plan.schema())), + }, + ) + .collect_vec(); + Self { + children: new_children, + properties: self.properties.clone(), + metrics: self.metrics.clone(), + child_weights: self.child_weights.clone(), + task_idx_map: self.task_idx_map.clone(), + } + } } impl DisplayAs for ChildrenIsolatorUnionExec { diff --git a/src/execution_plans/distributed_leaf.rs b/src/execution_plans/distributed_leaf.rs new file mode 100644 index 00000000..eca60fc2 --- /dev/null +++ b/src/execution_plans/distributed_leaf.rs @@ -0,0 +1,159 @@ +use crate::DistributedTaskContext; +use datafusion::common::{Result, Statistics, exec_err, not_impl_err}; +use datafusion::execution::{SendableRecordBatchStream, TaskContext}; +use datafusion::physical_expr_common::metrics::MetricsSet; +use datafusion::physical_plan::{DisplayAs, DisplayFormatType, ExecutionPlan, PlanProperties}; +use std::any::Any; +use std::fmt::Formatter; +use std::sync::Arc; + +/// Represents a leaf node ready to be distributed across N tasks, where the variant of the node +/// belonging to each task is stored in a `Vec` of N positions. +/// +/// While sending this plan over the wire to a remote worker, only the appropriate variant is sent. +/// +/// This [ExecutionPlan] implementation is typically returned by +/// [crate::TaskEstimator::scale_up_leaf_node], which will be called for scaling up a node for +/// distribution. The process typically looks like this: +/// +/// 1. The distributed planner calls [crate::TaskEstimator::scale_up_leaf_node] providing a leaf +/// node and the amount of tasks in which it should be distributed: +/// +/// ```text +/// ┌──────────────┐ +/// │DataSourceExec│ + 3 tasks +/// └──────────────┘ +/// ``` +/// +/// 2. The [crate::TaskEstimator] implementation, either user provided or a default one, returns +/// a [DistributedLeafExec] adhering to this task count: +/// +/// ```text +/// ┌────────────────────────────────────────────────┐ +/// │ DistributedLeafExec │ +/// │ │ +/// │┌──────────────┐┌──────────────┐┌──────────────┐│ +/// ││DataSourceExec││DataSourceExec││DataSourceExec││ +/// ││ for task 0 ││ for task 1 ││ for task 2 ││ +/// │└──────────────┘└──────────────┘└──────────────┘│ +/// └────────────────────────────────────────────────┘ +/// ``` +/// +/// 3. The [crate::DistributedExec] node, upon being executed, will send the different variants of +/// the leaf node to the respective workers, instead of sending the full [DistributedLeafExec]: +/// +/// ```text +/// ┌──────────────────┐┌──────────────────┐┌──────────────────┐ +/// │ Worker 0 ││ Worker 1 ││ Worker 2 │ +/// │ ││ ││ │ +/// │ ... ││ ... ││ ... │ +/// │ ││ ││ │ +/// │ ┌──────────────┐ ││ ┌──────────────┐ ││ ┌──────────────┐ │ +/// │ │ SomeExec │ ││ │ SomeExec │ ││ │ SomeExec │ │ +/// │ │ │ ││ │ │ ││ │ │ │ +/// │ └──────────────┘ ││ └──────────────┘ ││ └──────────────┘ │ +/// │ ┌──────────────┐ ││ ┌──────────────┐ ││ ┌──────────────┐ │ +/// │ │DataSourceExec│ ││ │DataSourceExec│ ││ │DataSourceExec│ │ +/// │ │ for task 0 │ ││ │ for task 1 │ ││ │ for task 2 │ │ +/// │ └──────────────┘ ││ └──────────────┘ ││ └──────────────┘ │ +/// └──────────────────┘└──────────────────┘└──────────────────┘ +/// ``` +/// +/// This way, the different workers get to execute different versions of the same plan, each +/// handling its own range of non-overlapping data. +#[derive(Debug)] +pub struct DistributedLeafExec { + pub(crate) original: Arc, + pub(crate) variants: Vec>, +} + +impl DistributedLeafExec { + /// Builds a new [DistributedLeafExec] based on the provided original plan and its per-task + /// variants. Provided variants must expose the same partition count as the original plan. + pub fn new( + original: Arc, + variants: impl IntoIterator>, + ) -> Self { + Self { + original, + variants: variants.into_iter().collect(), + } + } + + /// Returns the variant belonging to provided task index. + pub(crate) fn to_task_specialized(&self, task_i: usize) -> Arc { + Arc::clone(&self.variants[task_i]) + } +} + +impl DisplayAs for DistributedLeafExec { + fn fmt_as(&self, t: DisplayFormatType, f: &mut Formatter) -> std::fmt::Result { + write!(f, "DistributedLeafExec: ")?; + self.original.fmt_as(t, f) + } +} + +impl ExecutionPlan for DistributedLeafExec { + fn name(&self) -> &str { + // Delegate to the original so that metrics lookups (which compare plan.name() against + // T::static_name()) work transparently with the wrapped plan type. For example, + // node_metrics:: finds DistributedLeafExec nodes in the distributed + // plan while also finding DataSourceExec nodes in the single-node plan. + self.original.name() + } + + fn static_name() -> &'static str + where + Self: Sized, + { + datafusion::catalog::memory::DataSourceExec::static_name() + } + + fn as_any(&self) -> &dyn Any { + self + } + + fn properties(&self) -> &Arc { + self.original.properties() + } + + fn children(&self) -> Vec<&Arc> { + vec![] + } + + fn with_new_children( + self: Arc, + _children: Vec>, + ) -> Result> { + not_impl_err!("DistributedLeafExec does not accept children") + } + + fn execute( + &self, + partition: usize, + context: Arc, + ) -> Result { + let d_ctx = DistributedTaskContext::from_ctx(&context); + if d_ctx.task_count == 1 { + return self.original.execute(partition, context); + } + + let Some(plan) = self.variants.get(d_ctx.task_index) else { + return exec_err!( + "Task index {} out of range for a per_task vector of length {}", + d_ctx.task_index, + self.variants.len() + ); + }; + + plan.execute(partition, context) + } + + fn metrics(&self) -> Option { + self.original.metrics() + } + + fn partition_statistics(&self, partition: Option) -> Result { + self.original.partition_statistics(partition) + } +} diff --git a/src/execution_plans/mod.rs b/src/execution_plans/mod.rs index ba3070dd..a1ea6316 100644 --- a/src/execution_plans/mod.rs +++ b/src/execution_plans/mod.rs @@ -1,11 +1,11 @@ mod broadcast; mod children_isolator_union; mod common; +mod distributed_leaf; mod metrics; mod network_broadcast; mod network_coalesce; mod network_shuffle; -mod partition_isolator; #[cfg(any(test, feature = "integration"))] pub mod benchmarks; @@ -13,8 +13,8 @@ pub mod benchmarks; pub use broadcast::BroadcastExec; pub(crate) use children_isolator_union::ChildWeight; pub use children_isolator_union::ChildrenIsolatorUnionExec; +pub use distributed_leaf::DistributedLeafExec; pub(crate) use metrics::MetricsWrapperExec; pub use network_broadcast::NetworkBroadcastExec; pub use network_coalesce::NetworkCoalesceExec; pub use network_shuffle::NetworkShuffleExec; -pub use partition_isolator::PartitionIsolatorExec; diff --git a/src/execution_plans/network_broadcast.rs b/src/execution_plans/network_broadcast.rs index 60f1c9a1..64b06e8f 100644 --- a/src/execution_plans/network_broadcast.rs +++ b/src/execution_plans/network_broadcast.rs @@ -21,8 +21,8 @@ use uuid::Uuid; /// /// This operator works with [BroadcastExec] which scales up partitions so each /// consumer task fetches a unique set of partition numbers. Each partition request -/// is sent to all stage tasks because PartitionIsolatorExec maps the same logical -/// partition to different actual data on each task. +/// is sent to all stage tasks because each task's leaf node is specialized to serve +/// a different slice of the data for the same logical partition number. /// /// Here are some examples of how [NetworkBroadcastExec] distributes data: /// diff --git a/src/execution_plans/partition_isolator.rs b/src/execution_plans/partition_isolator.rs deleted file mode 100644 index 4207c416..00000000 --- a/src/execution_plans/partition_isolator.rs +++ /dev/null @@ -1,234 +0,0 @@ -use crate::DistributedTaskContext; -use crate::common::require_one_child; -use datafusion::execution::TaskContext; -use datafusion::physical_expr_common::metrics::MetricsSet; -use datafusion::physical_plan::ExecutionPlanProperties; -use datafusion::physical_plan::metrics::{ExecutionPlanMetricsSet, MetricBuilder}; -use datafusion::physical_plan::stream::RecordBatchStreamAdapter; -use datafusion::{ - error::Result, - execution::SendableRecordBatchStream, - physical_plan::{ - DisplayAs, DisplayFormatType, EmptyRecordBatchStream, ExecutionPlan, Partitioning, - PlanProperties, - }, -}; -use futures::TryStreamExt; -use std::{fmt::Formatter, sync::Arc}; - -/// This is a simple [ExecutionPlan] that isolates a set of N partitions from an input -/// [ExecutionPlan] with M partitions, where N < M. -/// -/// It will advertise to upper nodes that only N partitions are available, even though the child -/// plan might have more. -/// -/// The partitions exposed to upper nodes depend on: -/// 1. the amount of tasks in the stage in which [PartitionIsolatorExec] is in. -/// 2. the task index executing the [PartitionIsolatorExec] node. -/// -/// ```text -/// ┌───────────────────────────┐ ■ -/// │ NetworkCoalesceExec │ │ -/// │ (task 1) │ │ -/// └┬─┬┬─┬┬─┬┬─┬┬─┬┬─┬┬─┬┬─┬┬─┬┘ Stage N+1 -/// │1││2││3││4││5││6││7││8││9│ │ -/// └─┘└─┘└─┘└─┘└─┘└─┘└─┘└─┘└─┘ │ -/// ▲ ▲ ▲ ▲ ▲ ▲ ▲ ▲ ▲ ■ -/// ┌──┬──┬───────────────────────┴──┴──┘ │ │ │ └──┴──┴──────────────────────┬──┬──┐ -/// │ │ │ │ │ │ │ │ │ ■ -/// ┌─┐┌─┐┌─┐ ┌─┐┌─┐┌─┐ ┌─┐┌─┐┌─┐ │ -/// │1││2││3│ │4││5││6│ │7││8││9│ │ -/// ┌┴─┴┴─┴┴─┴──────────────────┐ ┌─────────┴─┴┴─┴┴─┴─────────┐ ┌──────────────────┴─┴┴─┴┴─┴┐ │ -/// │ PartitionIsolatorExec │ │ PartitionIsolatorExec │ │ PartitionIsolatorExec │ │ -/// │ (task 1) │ │ (task 2) │ │ (task 3) │ │ -/// └─▲──▲──▲───────────────────┘ └──────────▲──▲──▲──────────┘ └───────────────────▲──▲──▲─┘ │ -/// │ │ │ ◌ ◌ ◌ ◌ ◌ ◌ ◌ ◌ ◌ │ │ │ ◌ ◌ ◌ ◌ ◌ ◌ ◌ ◌ ◌ │ │ │ Stage N -/// │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ -/// ┌─┐┌─┐┌─┐┌─┐┌─┐┌─┐┌─┐┌─┐┌─┐ ┌─┐┌─┐┌─┐┌─┐┌─┐┌─┐┌─┐┌─┐┌─┐ ┌─┐┌─┐┌─┐┌─┐┌─┐┌─┐┌─┐┌─┐┌─┐ │ -/// │1││2││3││4││5││6││7││8││9│ │1││2││3││4││5││6││7││8││9│ │1││2││3││4││5││6││7││8││9│ │ -/// ┌┴─┴┴─┴┴─┴┴─┴┴─┴┴─┴┴─┴┴─┴┴─┴┐ ┌┴─┴┴─┴┴─┴┴─┴┴─┴┴─┴┴─┴┴─┴┴─┴┐ ┌┴─┴┴─┴┴─┴┴─┴┴─┴┴─┴┴─┴┴─┴┴─┴┐ │ -/// │ DataSourceExec │ │ DataSourceExec │ │ DataSourceExec │ │ -/// │ (task 1) │ │ (task 2) │ │ (task 3) │ │ -/// └───────────────────────────┘ └───────────────────────────┘ └───────────────────────────┘ ■ -/// ``` -#[derive(Debug)] -pub struct PartitionIsolatorExec { - pub(crate) input: Arc, - pub(crate) properties: Arc, - pub(crate) n_tasks: usize, - pub(crate) metrics: ExecutionPlanMetricsSet, -} - -impl PartitionIsolatorExec { - pub fn new(input: Arc, n_tasks: usize) -> Self { - let input_partitions = input.properties().partitioning.partition_count(); - - let partition_count = Self::partition_groups(input_partitions, n_tasks)[0].len(); - - let properties = ::clone(&input.properties().clone()) - .with_partitioning(Partitioning::UnknownPartitioning(partition_count)); - - Self { - input: input.clone(), - properties: Arc::new(properties), - n_tasks, - metrics: ExecutionPlanMetricsSet::new(), - } - } - - fn partition_groups(input_partitions: usize, n_tasks: usize) -> Vec> { - let q = input_partitions / n_tasks; - let r = input_partitions % n_tasks; - - let mut off = 0; - (0..n_tasks) - .map(|i| q + if i < r { 1 } else { 0 }) - .map(|n| { - let result = (off..(off + n)).collect(); - off += n; - result - }) - .collect() - } - - pub(crate) fn partition_group( - input_partitions: usize, - task_i: usize, - n_tasks: usize, - ) -> Vec { - Self::partition_groups(input_partitions, n_tasks)[task_i].clone() - } -} - -impl DisplayAs for PartitionIsolatorExec { - fn fmt_as(&self, t: DisplayFormatType, f: &mut Formatter) -> std::fmt::Result { - let input_partitions = self.input.output_partitioning().partition_count(); - write!(f, "PartitionIsolatorExec: ")?; - - match t { - DisplayFormatType::Verbose => { - let partition_groups = Self::partition_groups(input_partitions, self.n_tasks); - - let n: usize = partition_groups.iter().map(|v| v.len()).sum(); - let mut partitions = vec![]; - for _ in 0..self.n_tasks { - partitions.push(vec!["__".to_string(); n]); - } - - for (i, partition_group) in partition_groups.iter().enumerate() { - for (j, p) in partition_group.iter().enumerate() { - partitions[i][*p] = format!("p{j}") - } - write!(f, "t{i}:[{}] ", partitions[i].join(","))?; - } - } - _ => { - write!(f, "tasks={} partitions={}", self.n_tasks, input_partitions)?; - } - } - Ok(()) - } -} - -impl ExecutionPlan for PartitionIsolatorExec { - fn name(&self) -> &str { - "PartitionIsolatorExec" - } - - fn as_any(&self) -> &dyn std::any::Any { - self - } - - fn properties(&self) -> &Arc { - &self.properties - } - - fn children(&self) -> Vec<&Arc> { - vec![&self.input] - } - - fn with_new_children( - self: Arc, - children: Vec>, - ) -> Result> { - let input = require_one_child(children)?; - Ok(Arc::new(Self::new(input, self.n_tasks))) - } - - fn execute( - &self, - partition: usize, - context: Arc, - ) -> Result { - let task_context = DistributedTaskContext::from_ctx(&context); - - let metric = MetricBuilder::new(&self.metrics).output_rows(partition); - let input_partitions = self.input.output_partitioning().partition_count(); - - let partition_group = Self::partition_group( - input_partitions, - task_context.task_index, - task_context.task_count, - ); - - // if our partition group is [7,8,9] and we are asked for parittion 1, - // then look up that index in our group and execute that partition, in this - // example partition 8 - - match partition_group.get(partition) { - Some(actual_partition_number) => { - if *actual_partition_number >= input_partitions { - //trace!("{} returning empty stream", ctx_name); - Ok(Box::pin(EmptyRecordBatchStream::new(self.input.schema())) - as SendableRecordBatchStream) - } else { - Ok(Box::pin(RecordBatchStreamAdapter::new( - self.schema(), - self.input - .execute(*actual_partition_number, context)? - .inspect_ok(move |v| metric.add(v.num_rows())), - ))) - } - } - None => Ok(Box::pin(EmptyRecordBatchStream::new(self.input.schema())) - as SendableRecordBatchStream), - } - } - - fn metrics(&self) -> Option { - Some(self.metrics.clone_inner()) - } -} - -#[cfg(test)] -mod tests { - use super::*; - - #[test] - fn test_partition_groups() { - assert_eq!( - PartitionIsolatorExec::partition_groups(2, 1), - vec![vec![0, 1]] - ); - assert_eq!( - PartitionIsolatorExec::partition_groups(6, 2), - vec![vec![0, 1, 2], vec![3, 4, 5]] - ); - assert_eq!( - PartitionIsolatorExec::partition_groups(6, 3), - vec![vec![0, 1], vec![2, 3], vec![4, 5]] - ); - assert_eq!( - PartitionIsolatorExec::partition_groups(6, 4), - vec![vec![0, 1], vec![2, 3], vec![4], vec![5]] - ); - assert_eq!( - PartitionIsolatorExec::partition_groups(10, 3), - vec![vec![0, 1, 2, 3], vec![4, 5, 6], vec![7, 8, 9]] - ); - assert_eq!( - PartitionIsolatorExec::partition_groups(10, 4), - vec![vec![0, 1, 2], vec![3, 4, 5], vec![6, 7], vec![8, 9]] - ); - } -} diff --git a/src/lib.rs b/src/lib.rs index e93bb98d..7d5bf1fd 100644 --- a/src/lib.rs +++ b/src/lib.rs @@ -27,8 +27,8 @@ pub use distributed_planner::{ TaskCountAnnotation, TaskEstimation, TaskEstimator, TaskRoutingContext, }; pub use execution_plans::{ - BroadcastExec, NetworkBroadcastExec, NetworkCoalesceExec, NetworkShuffleExec, - PartitionIsolatorExec, + BroadcastExec, DistributedLeafExec, NetworkBroadcastExec, NetworkCoalesceExec, + NetworkShuffleExec, }; pub use metrics::{ AvgLatencyMetric, BytesCounterMetric, BytesMetricExt, DISTRIBUTED_DATAFUSION_TASK_ID_LABEL, diff --git a/src/metrics/task_metrics_collector.rs b/src/metrics/task_metrics_collector.rs index 1cf6d90a..f18a4206 100644 --- a/src/metrics/task_metrics_collector.rs +++ b/src/metrics/task_metrics_collector.rs @@ -315,56 +315,4 @@ mod tests { ); } } - - /// Test that verifies PartitionIsolatorExec nodes are preserved during metrics collection. - /// This tests the corner case where PartitionIsolatorExec nodes (which have no metrics) - /// must still be included in the metrics collection to maintain correct node-to-metric mapping. - #[tokio::test] - async fn test_metrics_collection_with_partition_isolator() { - let ctx = make_test_ctx().await; - ctx.sql("SET distributed.children_isolator_unions=true;") - .await - .unwrap(); - - // Use weather dataset (already available) - register_parquet_tables(&ctx).await.unwrap(); - - // UNION query that creates PartitionIsolatorExec nodes - let query = r#" - SELECT "MinTemp" FROM weather WHERE "RainToday" = 'yes' - UNION ALL - SELECT "MaxTemp" FROM weather WHERE "RainToday" = 'no' - "#; - - let df = ctx.sql(query).await.unwrap(); - let plan = df.create_physical_plan().await.unwrap(); - execute_plan(plan.clone(), &ctx).await; - - let dist_exec = plan - .as_any() - .downcast_ref::() - .expect("expected DistributedExec"); - - let (stages, expected_task_keys) = get_stages_and_task_keys(dist_exec); - - // Verify all nodes (including PartitionIsolatorExec) are preserved in metrics collection - for expected_task_key in expected_task_keys { - let actual_metrics = dist_exec - .task_metrics - .as_ref() - .unwrap() - .get(&expected_task_key) - .unwrap(); - let stage = stages.get(&(expected_task_key.stage_id as usize)).unwrap(); - let stage_plan = stage.local_plan().unwrap(); - - // Verify metrics count matches - this ensures all nodes are included in metrics collection - // regardless of whether they have metrics or not (some nodes may have empty metrics sets) - assert_eq!( - actual_metrics.len(), - count_plan_nodes_up_to_network_boundary(stage_plan), - "Metrics count must match plan nodes for stage {expected_task_key:?}" - ); - } - } } diff --git a/src/metrics/task_metrics_rewriter.rs b/src/metrics/task_metrics_rewriter.rs index 4b442f0a..92e71ed8 100644 --- a/src/metrics/task_metrics_rewriter.rs +++ b/src/metrics/task_metrics_rewriter.rs @@ -1,4 +1,5 @@ -use crate::common::serialize_uuid; +use crate::DistributedTaskContext; +use crate::common::{TreeNodeExt, serialize_uuid}; use crate::coordinator::{DistributedExec, MetricsStore}; use crate::distributed_planner::NetworkBoundaryExt; use crate::execution_plans::MetricsWrapperExec; @@ -7,6 +8,7 @@ use crate::metrics::collect_plan_metrics; use crate::metrics::proto::metrics_set_proto_to_df; use crate::stage::{LocalStage, Stage}; use crate::worker::generated::worker::TaskKey; +use datafusion::common::HashMap; use datafusion::common::plan_err; use datafusion::common::tree_node::Transformed; use datafusion::common::tree_node::TreeNode; @@ -213,62 +215,75 @@ pub fn stage_metrics_rewriter( metrics_collection: Arc, format: DistributedMetricsFormat, ) -> Result> { - let mut node_idx = 0; + // Phase 1 — accumulate per-task metrics into a map keyed by node identity. + // + // For each task, the plan is traversed with `apply_with_dt_ctx`, which visits nodes in pre-order + // traversal, ignoring branches that do not belong to the recursed DistributedTaskContext + // (e.g., because of the presence of ChildrenIsolatorUnionExec). + // + // The raw allocation address of each `Arc` as the node key. + // The planning plan is not modified between traversals, so these addresses are stable. + let mut node_metrics_map: HashMap = HashMap::new(); + + for task_id in 0..stage.tasks { + let d_ctx = DistributedTaskContext { + task_index: task_id, + task_count: stage.tasks, + }; + let task_key = TaskKey { + query_id: serialize_uuid(&stage.query_id), + stage_id: stage.num as u64, + task_number: task_id as u64, + }; + let Some(task_metrics) = metrics_collection.get(&task_key) else { + return internal_err!( + "not enough metrics provided to rewrite task: missing metrics for task {} in stage {}", + task_id, + stage.num + ); + }; + + let mut per_task_counter = 0usize; + stage.plan.apply_with_dt_ctx(d_ctx, |node, _ctx| { + if per_task_counter >= task_metrics.len() { + return internal_err!( + "not enough metrics provided to rewrite task: {} metrics provided", + task_metrics.len() + ); + } - Arc::clone(&stage.plan).transform_down(|plan| { - // Collect metrics for this node. It should contain metrics from each task. - let mut stage_metrics = MetricsSet::new(); + let node_metrics_protos = task_metrics[per_task_counter].clone(); + let mut node_metrics = metrics_set_proto_to_df(&node_metrics_protos)?; + let rewrite_ctx = format.to_rewrite_ctx(task_id as u64); + node_metrics = rewrite_ctx.maybe_rewrite_node_metics(node_metrics); - for task_id in 0..stage.tasks { - let task_key = TaskKey { - query_id: serialize_uuid(&stage.query_id), - stage_id: stage.num as u64, - task_number: task_id as u64, - }; - match metrics_collection.get(&task_key) { - Some(task_metrics) => { - if node_idx >= task_metrics.len() { - return internal_err!( - "not enough metrics provided to rewrite task: {} metrics provided", - task_metrics.len() - ); - } - let node_metrics_protos = task_metrics[node_idx].clone(); - let mut node_metrics = metrics_set_proto_to_df(&node_metrics_protos)?; - - let rewrite_ctx = format.to_rewrite_ctx(task_id as u64); - node_metrics = rewrite_ctx.maybe_rewrite_node_metics(node_metrics); - - for metric in node_metrics.iter().map(Arc::clone) { - stage_metrics.push(metric); - } - } - None => { - return internal_err!( - "not enough metrics provided to rewrite task: missing metrics for task {} in stage {}", - task_id, - stage.num - ); - } + let id = Arc::as_ptr(node) as *const () as usize; + let entry = node_metrics_map.entry(id).or_default(); + for metric in node_metrics.iter().map(Arc::clone) { + entry.push(metric); } - } - node_idx += 1; + per_task_counter += 1; + Ok(TreeNodeRecursion::Continue) + })?; + } - let wrapped_plan_node: Arc = Arc::new(MetricsWrapperExec::new( - plan.clone(), - stage_metrics, - )); - Ok(Transformed::new( - wrapped_plan_node, - true, - if plan.is_network_boundary() { - TreeNodeRecursion::Jump - } else { - TreeNodeRecursion::Continue - } - )) - }).map(|v| v.data) + // Phase 2 — rewrite: wrap every node with its accumulated metrics. + // Nodes that were inactive for all tasks (never visited in phase 1) get empty metrics. + Arc::clone(&stage.plan) + .transform_down(|plan| { + let id = Arc::as_ptr(&plan) as *const () as usize; + let metrics = node_metrics_map.remove(&id).unwrap_or_default(); + Ok(Transformed::new( + Arc::new(MetricsWrapperExec::new(plan.clone(), metrics)), + true, + match plan.is_network_boundary() { + true => TreeNodeRecursion::Jump, + false => TreeNodeRecursion::Continue, + }, + )) + }) + .map(|v| v.data) } #[cfg(test)] diff --git a/src/protobuf/distributed_codec.rs b/src/protobuf/distributed_codec.rs index 3ba3ac0f..c6c03e55 100644 --- a/src/protobuf/distributed_codec.rs +++ b/src/protobuf/distributed_codec.rs @@ -1,4 +1,5 @@ use super::get_distributed_user_codecs; +use crate::NetworkShuffleExec; use crate::common::{deserialize_uuid, serialize_uuid}; use crate::execution_plans::{ BroadcastExec, ChildWeight, ChildrenIsolatorUnionExec, NetworkBroadcastExec, @@ -7,7 +8,6 @@ use crate::execution_plans::{ use crate::stage::{LocalStage, RemoteStage, Stage}; use crate::worker::WorkerConnectionPool; use crate::{DistributedTaskContext, NetworkBoundary}; -use crate::{NetworkShuffleExec, PartitionIsolatorExec}; use bytes::Bytes; use datafusion::arrow::datatypes::Schema; use datafusion::arrow::datatypes::SchemaRef; @@ -144,21 +144,6 @@ impl PhysicalExtensionCodec for DistributedCodec { parse_stage_proto(input_stage, inputs)?, ))) } - DistributedExecNode::PartitionIsolator(PartitionIsolatorExecProto { n_tasks }) => { - if inputs.len() != 1 { - return Err(proto_error(format!( - "PartitionIsolatorExec expects exactly one child, got {}", - inputs.len() - ))); - } - - let child = inputs.first().unwrap(); - - Ok(Arc::new(PartitionIsolatorExec::new( - child.clone(), - n_tasks as usize, - ))) - } DistributedExecNode::NetworkBroadcast(NetworkBroadcastExecProto { schema, partitioning, @@ -305,16 +290,6 @@ impl PhysicalExtensionCodec for DistributedCodec { node: Some(DistributedExecNode::NetworkCoalesceTasks(inner)), }; - wrapper.encode(buf).map_err(|e| proto_error(format!("{e}"))) - } else if let Some(node) = node.as_any().downcast_ref::() { - let inner = PartitionIsolatorExecProto { - n_tasks: node.n_tasks as u64, - }; - - let wrapper = DistributedExecProto { - node: Some(DistributedExecNode::PartitionIsolator(inner)), - }; - wrapper.encode(buf).map_err(|e| proto_error(format!("{e}"))) } else if let Some(node) = node.as_any().downcast_ref::() { let inner = NetworkBroadcastExecProto { @@ -414,8 +389,7 @@ pub enum DistributedExecNode { NetworkHashShuffle(NetworkShuffleExecProto), #[prost(message, tag = "2")] NetworkCoalesceTasks(NetworkCoalesceExecProto), - #[prost(message, tag = "3")] - PartitionIsolator(PartitionIsolatorExecProto), + // reserved 3 #[prost(message, tag = "4")] ChildrenIsolatorUnion(ChildrenIsolatorUnionExecProto), #[prost(message, tag = "5")] @@ -424,12 +398,6 @@ pub enum DistributedExecNode { Broadcast(BroadcastExecProto), } -#[derive(Clone, PartialEq, ::prost::Message)] -pub struct PartitionIsolatorExecProto { - #[prost(uint64, tag = "1")] - pub n_tasks: u64, -} - /// Protobuf representation of the [NetworkShuffleExec] physical node. It serves as /// an intermediate format for serializing/deserializing [NetworkShuffleExec] nodes /// to send them over the wire. @@ -563,13 +531,12 @@ mod tests { use datafusion::arrow::datatypes::{DataType, Field}; use datafusion::physical_expr::LexOrdering; use datafusion::physical_plan::empty::EmptyExec; + use datafusion::prelude::SessionContext; use datafusion::{ physical_expr::{Partitioning, PhysicalSortExpr, expressions::Column, expressions::col}, physical_plan::{ExecutionPlan, displayable, sorts::sort::SortExec, union::UnionExec}, }; - use datafusion::prelude::SessionContext; - fn empty_exec() -> Arc { Arc::new(EmptyExec::new(SchemaRef::new(Schema::empty()))) } @@ -623,30 +590,7 @@ mod tests { } #[test] - fn test_roundtrip_isolator_flight() -> datafusion::common::Result<()> { - let codec = DistributedCodec; - let ctx = create_context(); - - let schema = schema_i32("b"); - let flight = Arc::new(new_network_hash_shuffle_exec( - Partitioning::UnknownPartitioning(1), - schema, - dummy_stage(), - )); - - let plan: Arc = Arc::new(PartitionIsolatorExec::new(flight.clone(), 1)); - - let mut buf = Vec::new(); - codec.try_encode(plan.clone(), &mut buf)?; - - let decoded = codec.try_decode(&buf, &[flight], &ctx)?; - assert_eq!(repr(&plan), repr(&decoded)); - - Ok(()) - } - - #[test] - fn test_roundtrip_isolator_union() -> datafusion::common::Result<()> { + fn test_roundtrip_union() -> datafusion::common::Result<()> { let codec = DistributedCodec; let ctx = create_context(); @@ -663,7 +607,8 @@ mod tests { )); let union = UnionExec::try_new(vec![left.clone(), right.clone()])?; - let plan: Arc = Arc::new(PartitionIsolatorExec::new(union.clone(), 1)); + let plan: Arc = + Arc::new(NetworkCoalesceExec::try_new(union.clone(), 1, 1)?); let mut buf = Vec::new(); codec.try_encode(plan.clone(), &mut buf)?; @@ -675,7 +620,7 @@ mod tests { } #[test] - fn test_roundtrip_isolator_sort_flight() -> datafusion::common::Result<()> { + fn test_roundtrip_sort_flight() -> datafusion::common::Result<()> { let codec = DistributedCodec; let ctx = create_context(); @@ -695,7 +640,8 @@ mod tests { flight.clone(), )); - let plan: Arc = Arc::new(PartitionIsolatorExec::new(sort.clone(), 1)); + let plan: Arc = + Arc::new(NetworkCoalesceExec::try_new(sort.clone(), 1, 1)?); let mut buf = Vec::new(); codec.try_encode(plan.clone(), &mut buf)?; @@ -771,7 +717,7 @@ mod tests { } #[test] - fn test_roundtrip_isolator_flight_coalesce() -> datafusion::common::Result<()> { + fn test_roundtrip_flight_coalesce() -> datafusion::common::Result<()> { let codec = DistributedCodec; let ctx = create_context(); @@ -782,7 +728,8 @@ mod tests { dummy_stage(), )); - let plan: Arc = Arc::new(PartitionIsolatorExec::new(flight.clone(), 1)); + let plan: Arc = + Arc::new(NetworkCoalesceExec::try_new(flight.clone(), 1, 1)?); let mut buf = Vec::new(); codec.try_encode(plan.clone(), &mut buf)?; @@ -794,7 +741,7 @@ mod tests { } #[test] - fn test_roundtrip_isolator_union_coalesce() -> datafusion::common::Result<()> { + fn test_roundtrip_union_coalesce() -> datafusion::common::Result<()> { let codec = DistributedCodec; let ctx = create_context(); @@ -811,7 +758,8 @@ mod tests { )); let union = UnionExec::try_new(vec![left.clone(), right.clone()])?; - let plan: Arc = Arc::new(PartitionIsolatorExec::new(union.clone(), 3)); + let plan: Arc = + Arc::new(NetworkCoalesceExec::try_new(union.clone(), 1, 1)?); let mut buf = Vec::new(); codec.try_encode(plan.clone(), &mut buf)?; diff --git a/src/stage.rs b/src/stage.rs index 5c584d7d..7f170974 100644 --- a/src/stage.rs +++ b/src/stage.rs @@ -1,7 +1,6 @@ use crate::coordinator::DistributedExec; use crate::execution_plans::NetworkCoalesceExec; use crate::metrics::DISTRIBUTED_DATAFUSION_TASK_ID_LABEL; -use crate::{NetworkShuffleExec, PartitionIsolatorExec}; use datafusion::common::{HashMap, config_err}; use datafusion::common::{exec_err, plan_err}; use datafusion::error::Result; @@ -157,7 +156,7 @@ impl DistributedTaskContext { } } -use crate::{DistributedMetricsFormat, rewrite_distributed_plan_with_metrics}; +use crate::{DistributedMetricsFormat, NetworkShuffleExec, rewrite_distributed_plan_with_metrics}; use crate::{NetworkBoundary, NetworkBoundaryExt}; use datafusion::common::DataFusionError; use datafusion::physical_expr::Partitioning; @@ -565,7 +564,7 @@ fn display_single_task(stage: &Stage, task_i: usize) -> Result { fn display_plan( plan: &Arc, task_i: usize, - n_tasks: usize, + _n_tasks: usize, stage_num: usize, ) -> Result { // draw all plans @@ -596,31 +595,14 @@ fn display_plan( usize, ); let mut queue: VecDeque = VecDeque::from([(plan, None, 0usize)]); - let mut isolator_partition_group = None; node_index = 0; while let Some((plan, maybe_parent, parent_idx)) = queue.pop_front() { node_index += 1; - if let Some(node) = plan.as_any().downcast_ref::() { - isolator_partition_group = Some(PartitionIsolatorExec::partition_group( - node.input.output_partitioning().partition_count(), - task_i, - n_tasks, - )); - } if let Some(parent) = maybe_parent { let output_partitions = plan.output_partitioning().partition_count(); for i in 0..output_partitions { - let mut style = ""; - if plan.as_any().is::() { - if i >= isolator_partition_group.as_ref().map_or(0, |v| v.len()) { - style = "[style=dotted, label=empty]"; - } - } else if let Some(partition_group) = &isolator_partition_group - && !partition_group.contains(&i) - { - style = "[style=invis]"; - } + let style = ""; writeln!( f, diff --git a/src/test_utils/routing.rs b/src/test_utils/routing.rs index 47b3a279..fbf41158 100644 --- a/src/test_utils/routing.rs +++ b/src/test_utils/routing.rs @@ -22,8 +22,9 @@ use std::{any::Any, fmt::Formatter, sync::Arc}; use tonic::async_trait; use url::Url; +use crate::execution_plans::DistributedLeafExec; use crate::worker::LocalWorkerContext; -use crate::{DistributedTaskContext, PartitionIsolatorExec, TaskEstimation, TaskEstimator}; +use crate::{DistributedTaskContext, TaskEstimation, TaskEstimator}; // Table function that creates a `URLEmitterExec` for testing task routing. #[derive(Debug)] @@ -108,6 +109,10 @@ pub struct URLEmitterExec { task_count: usize, tag: String, projection: Option>, + /// How many of the visible partitions actually produce data. Partitions at index + /// `>= effective_partitions` return an empty stream, letting tail tasks in an uneven + /// distribution produce fewer rows without changing the visible partition count. + effective_partitions: usize, } impl URLEmitterExec { @@ -131,8 +136,24 @@ impl URLEmitterExec { task_count, tag, projection, + effective_partitions: partitions, } } + + fn with_partitions(mut self, visible: usize, effective: usize) -> Self { + let schema = match &self.projection { + Some(indices) => Arc::new(url_emitter_schema().project(indices).unwrap()), + None => url_emitter_schema(), + }; + self.properties = Arc::new(PlanProperties::new( + EquivalenceProperties::new(schema), + datafusion::physical_plan::Partitioning::UnknownPartitioning(visible), + datafusion::physical_plan::execution_plan::EmissionType::Incremental, + datafusion::physical_plan::execution_plan::Boundedness::Bounded, + )); + self.effective_partitions = effective; + self + } } impl DisplayAs for URLEmitterExec { @@ -173,18 +194,23 @@ impl ExecutionPlan for URLEmitterExec { fn execute( &self, - _partition: usize, + partition: usize, context: Arc, ) -> datafusion::error::Result { + let schema = match &self.projection { + Some(indices) => Arc::new(url_emitter_schema().project(indices).unwrap()), + None => url_emitter_schema(), + }; + // Partitions beyond the effective range are padding; return empty. + if partition >= self.effective_partitions { + use datafusion::physical_plan::empty::EmptyExec; + return EmptyExec::new(schema).execute(0, context); + } let distributed_ctx = DistributedTaskContext::from_ctx(&context); let local_worker_ctx = context .session_config() .get_extension::() .expect("URLEmitterExec requires LocalWorkerContext during distributed execution"); - let schema = match &self.projection { - Some(indices) => Arc::new(url_emitter_schema().project(indices).unwrap()), - None => url_emitter_schema(), - }; let mut columns: Vec> = vec![ Arc::new(Int64Array::from(vec![distributed_ctx.task_count as i64])), Arc::new(Int64Array::from(vec![distributed_ctx.task_index as i64])), @@ -244,9 +270,26 @@ impl TaskEstimator for URLEmitterTaskEstimator { task_count: usize, _cfg: &datafusion::config::ConfigOptions, ) -> Option> { - let plan: Arc = - Arc::new(PartitionIsolatorExec::new(Arc::clone(plan), task_count)); - Some(plan) + let exec = plan.as_any().downcast_ref::()?; + let p = exec.properties.partitioning.partition_count(); + // Expose ceil(p / task_count) partitions per task so the network boundary + // computes a consistent output partition count. + let visible = p.div_ceil(task_count).max(1); + let template = Arc::new(exec.clone().with_partitions(visible, visible)); + + // Distribute p partitions across task_count tasks using the floor/remainder algorithm: + // the first (p % task_count) tasks get ceil(p/task_count) effective partitions, the rest + // get floor — using the floor/remainder distribution algorithm. + let q = p / task_count; + let r = p % task_count; + let per_task: Vec> = (0..task_count) + .map(|task_idx| { + let effective = q + if task_idx < r { 1 } else { 0 }; + Arc::new(exec.clone().with_partitions(visible, effective)) as _ + }) + .collect(); + + Some(Arc::new(DistributedLeafExec::new(template as _, per_task))) } fn route_tasks( @@ -271,6 +314,8 @@ struct URLEmitterExecProto { tag: String, #[prost(uint64, repeated, tag = "4")] projection: Vec, + #[prost(uint64, tag = "5")] + effective_partitions: u64, } #[derive(Debug)] @@ -292,16 +337,22 @@ impl PhysicalExtensionCodec for URLEmitterExtensionCodec { let proto = URLEmitterExecProto::decode(buf) .map_err(|e| proto_error(format!("Failed to decode URLEmitterExecProto: {e}")))?; - Ok(Arc::new(URLEmitterExec::new( - proto.partitions as usize, - proto.task_count as usize, - proto.tag, - if proto.projection.is_empty() { - None - } else { - Some(proto.projection.into_iter().map(|v| v as usize).collect()) - }, - ))) + Ok(Arc::new( + URLEmitterExec::new( + proto.partitions as usize, + proto.task_count as usize, + proto.tag, + if proto.projection.is_empty() { + None + } else { + Some(proto.projection.into_iter().map(|v| v as usize).collect()) + }, + ) + .with_partitions( + proto.partitions as usize, + proto.effective_partitions as usize, + ), + )) } fn try_encode(&self, node: Arc, buf: &mut Vec) -> Result<()> { @@ -320,6 +371,7 @@ impl PhysicalExtensionCodec for URLEmitterExtensionCodec { .into_iter() .map(|v| v as u64) .collect(), + effective_partitions: exec.effective_partitions as u64, }; proto diff --git a/src/worker/impl_execute_task.rs b/src/worker/impl_execute_task.rs index 13e3a4a8..a902ad06 100644 --- a/src/worker/impl_execute_task.rs +++ b/src/worker/impl_execute_task.rs @@ -1,14 +1,14 @@ -use crate::DistributedConfig; -use crate::common::{map_last_stream, on_drop_stream}; +use crate::common::{TreeNodeExt, map_last_stream, on_drop_stream}; use crate::metrics::proto::df_metrics_set_to_proto; use crate::protobuf::datafusion_error_to_tonic_status; use crate::worker::generated::worker::{FlightAppMetadata, PreOrderTaskMetrics}; use crate::worker::worker_service::{TaskDataEntries, Worker}; +use crate::{DistributedConfig, DistributedTaskContext}; use arrow_flight::encode::{DictionaryHandling, FlightDataEncoder, FlightDataEncoderBuilder}; use arrow_flight::error::FlightError; use arrow_select::dictionary::garbage_collect_any_dictionary; use datafusion::arrow::array::{Array, AsArray, RecordBatch, RecordBatchOptions}; -use datafusion::common::tree_node::{TreeNode, TreeNodeRecursion}; +use datafusion::common::tree_node::TreeNodeRecursion; use datafusion::common::{Result, exec_err, internal_err}; use crate::worker::generated::worker::ExecuteTaskRequest; @@ -62,6 +62,7 @@ pub(crate) async fn execute_local_task( let plan = task_data.plan; let task_ctx = task_data.task_ctx; let d_cfg = DistributedConfig::from_config_options(task_ctx.session_config().options())?; + let d_ctx = DistributedTaskContext::from_ctx(&task_ctx).as_ref().clone(); let send_metrics = d_cfg.collect_metrics; let partition_count = plan.properties().partitioning.partition_count(); @@ -90,6 +91,7 @@ pub(crate) async fn execute_local_task( let plan_for_drop = Arc::clone(&plan); let fully_finished = Arc::new(AtomicBool::new(false)); let fully_finished_cloned = Arc::clone(&fully_finished); + let d_ctx_for_last = d_ctx.clone(); let stream = map_last_stream(stream, move |msg, last_msg_in_stream| { if !last_msg_in_stream { return msg; @@ -109,7 +111,7 @@ pub(crate) async fn execute_local_task( if send_metrics { // Last message of the last partition. This is the moment to send // the metrics back. - send_metrics_via_channel(&metrics_tx, &plan); + send_metrics_via_channel(&metrics_tx, &plan, d_ctx_for_last.clone()); } } fully_finished.store(true, Ordering::SeqCst); @@ -120,6 +122,7 @@ pub(crate) async fn execute_local_task( let task_data_entries_for_drop = Arc::clone(task_data_entries); let metrics_tx = Arc::clone(&task_data.metrics_tx); let key_for_drop = key.clone(); + let d_ctx_for_drop = d_ctx.clone(); let stream = on_drop_stream(stream, move || { if !fully_finished_cloned.load(Ordering::SeqCst) { // Stream was dropped before fully consumed -- see https://github.com/datafusion-contrib/datafusion-distributed/issues/412 @@ -134,7 +137,11 @@ pub(crate) async fn execute_local_task( entries.invalidate(&k).await; }); if send_metrics { - send_metrics_via_channel(&metrics_tx, &plan_for_drop); + send_metrics_via_channel( + &metrics_tx, + &plan_for_drop, + d_ctx_for_drop.clone(), + ); } } } @@ -245,9 +252,10 @@ fn build_flight_data_stream( fn send_metrics_via_channel( metrics_tx: &Arc>>>, plan: &Arc, + dt_ctx: DistributedTaskContext, ) { let mut metrics = vec![]; - let _ = plan.apply(|node| { + let _ = plan.apply_with_dt_ctx(dt_ctx, |node, _| { metrics.push( node.metrics() .and_then(|m| df_metrics_set_to_proto(&m).ok()) diff --git a/tests/clickbench_plans_test.rs b/tests/clickbench_plans_test.rs index 04473548..6c944ab8 100644 --- a/tests/clickbench_plans_test.rs +++ b/tests/clickbench_plans_test.rs @@ -26,18 +26,17 @@ mod tests { #[tokio::test] async fn test_clickbench_1() -> Result<()> { let display = test_clickbench_query("q1").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ ProjectionExec: expr=[count(Int64(1))@0 as count(*)] │ AggregateExec: mode=Final, gby=[], aggr=[count(Int64(1))] │ CoalescePartitionsExec - │ [Stage 1] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 1] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 1 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[count(Int64(1))] │ FilterExec: AdvEngineID@0 != 0, projection=[] - │ PartitionIsolatorExec: tasks=3 partitions=5 - │ DataSourceExec: file_groups={5 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[AdvEngineID], file_type=parquet, predicate=AdvEngineID@40 != 0, pruning_predicate=AdvEngineID_null_count@2 != row_count@3 AND (AdvEngineID_min@0 != 0 OR 0 != AdvEngineID_max@1), required_guarantees=[AdvEngineID not in (0)] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:.., /testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:.., /testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[AdvEngineID], file_type=parquet, predicate=AdvEngineID@40 != 0, pruning_predicate=AdvEngineID_null_count@2 != row_count@3 AND (AdvEngineID_min@0 != 0 OR 0 != AdvEngineID_max@1), required_guarantees=[AdvEngineID not in (0)] └────────────────────────────────────────────────── "); Ok(()) @@ -46,17 +45,16 @@ mod tests { #[tokio::test] async fn test_clickbench_2() -> Result<()> { let display = test_clickbench_query("q2").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ ProjectionExec: expr=[sum(hits.AdvEngineID)@0 as sum(hits.AdvEngineID), count(Int64(1))@1 as count(*), avg(hits.ResolutionWidth)@2 as avg(hits.ResolutionWidth)] │ AggregateExec: mode=Final, gby=[], aggr=[sum(hits.AdvEngineID), count(Int64(1)), avg(hits.ResolutionWidth)] │ CoalescePartitionsExec - │ [Stage 1] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 1] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 1 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[sum(hits.AdvEngineID), count(Int64(1)), avg(hits.ResolutionWidth)] - │ PartitionIsolatorExec: tasks=3 partitions=5 - │ DataSourceExec: file_groups={5 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[ResolutionWidth, AdvEngineID], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:.., /testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:.., /testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[ResolutionWidth, AdvEngineID], file_type=parquet └────────────────────────────────────────────────── "); Ok(()) @@ -84,7 +82,7 @@ mod tests { #[tokio::test] async fn test_clickbench_4() -> Result<()> { let display = test_clickbench_query("q4").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ ProjectionExec: expr=[count(alias1)@0 as count(DISTINCT hits.UserID)] │ AggregateExec: mode=Final, gby=[], aggr=[count(alias1)] @@ -97,10 +95,9 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([alias1@0], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([alias1@0], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[UserID@0 as alias1], aggr=[] - │ PartitionIsolatorExec: tasks=3 partitions=5 - │ DataSourceExec: file_groups={5 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[UserID], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:.., /testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:.., /testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[UserID], file_type=parquet └────────────────────────────────────────────────── "); Ok(()) @@ -109,7 +106,7 @@ mod tests { #[tokio::test] async fn test_clickbench_5() -> Result<()> { let display = test_clickbench_query("q5").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ ProjectionExec: expr=[count(alias1)@0 as count(DISTINCT hits.SearchPhrase)] │ AggregateExec: mode=Final, gby=[], aggr=[count(alias1)] @@ -122,10 +119,9 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([alias1@0], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([alias1@0], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[SearchPhrase@0 as alias1], aggr=[] - │ PartitionIsolatorExec: tasks=3 partitions=5 - │ DataSourceExec: file_groups={5 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[SearchPhrase], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:.., /testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:.., /testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[SearchPhrase], file_type=parquet └────────────────────────────────────────────────── "); Ok(()) @@ -142,7 +138,7 @@ mod tests { #[tokio::test] async fn test_clickbench_7() -> Result<()> { let display = test_clickbench_query("q7").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ ProjectionExec: expr=[AdvEngineID@0 as AdvEngineID, count(*)@1 as count(*)] │ SortPreservingMergeExec: [count(Int64(1))@2 DESC] @@ -155,11 +151,10 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([AdvEngineID@0], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([AdvEngineID@0], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[AdvEngineID@0 as AdvEngineID], aggr=[count(Int64(1))] │ FilterExec: AdvEngineID@0 != 0 - │ PartitionIsolatorExec: tasks=3 partitions=5 - │ DataSourceExec: file_groups={5 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[AdvEngineID], file_type=parquet, predicate=AdvEngineID@40 != 0, pruning_predicate=AdvEngineID_null_count@2 != row_count@3 AND (AdvEngineID_min@0 != 0 OR 0 != AdvEngineID_max@1), required_guarantees=[AdvEngineID not in (0)] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:.., /testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:.., /testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[AdvEngineID], file_type=parquet, predicate=AdvEngineID@40 != 0, pruning_predicate=AdvEngineID_null_count@2 != row_count@3 AND (AdvEngineID_min@0 != 0 OR 0 != AdvEngineID_max@1), required_guarantees=[AdvEngineID not in (0)] └────────────────────────────────────────────────── "); Ok(()) @@ -168,7 +163,7 @@ mod tests { #[tokio::test] async fn test_clickbench_8() -> Result<()> { let display = test_clickbench_query("q8").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [u@1 DESC], fetch=10 │ SortExec: TopK(fetch=10), expr=[u@1 DESC], preserve_partitioning=[true] @@ -183,10 +178,9 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([RegionID@0, alias1@1], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([RegionID@0, alias1@1], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[RegionID@0 as RegionID, UserID@1 as alias1], aggr=[] - │ PartitionIsolatorExec: tasks=3 partitions=5 - │ DataSourceExec: file_groups={5 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[RegionID, UserID], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:.., /testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:.., /testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[RegionID, UserID], file_type=parquet └────────────────────────────────────────────────── "); Ok(()) @@ -195,7 +189,7 @@ mod tests { #[tokio::test] async fn test_clickbench_9() -> Result<()> { let display = test_clickbench_query("q9").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [c@2 DESC], fetch=10 │ [Stage 2] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -207,10 +201,9 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([RegionID@0], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([RegionID@0], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[RegionID@0 as RegionID], aggr=[sum(hits.AdvEngineID), count(Int64(1)), avg(hits.ResolutionWidth), count(DISTINCT hits.UserID)] - │ PartitionIsolatorExec: tasks=3 partitions=5 - │ DataSourceExec: file_groups={5 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[RegionID, UserID, ResolutionWidth, AdvEngineID], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:.., /testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:.., /testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[RegionID, UserID, ResolutionWidth, AdvEngineID], file_type=parquet └────────────────────────────────────────────────── "); Ok(()) @@ -219,7 +212,7 @@ mod tests { #[tokio::test] async fn test_clickbench_10() -> Result<()> { let display = test_clickbench_query("q10").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [u@1 DESC], fetch=10 │ SortExec: TopK(fetch=10), expr=[u@1 DESC], preserve_partitioning=[true] @@ -234,11 +227,10 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([MobilePhoneModel@0, alias1@1], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([MobilePhoneModel@0, alias1@1], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[MobilePhoneModel@1 as MobilePhoneModel, UserID@0 as alias1], aggr=[] │ FilterExec: MobilePhoneModel@1 != - │ PartitionIsolatorExec: tasks=3 partitions=5 - │ DataSourceExec: file_groups={5 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[UserID, MobilePhoneModel], file_type=parquet, predicate=MobilePhoneModel@34 != , pruning_predicate=MobilePhoneModel_null_count@2 != row_count@3 AND (MobilePhoneModel_min@0 != OR != MobilePhoneModel_max@1), required_guarantees=[MobilePhoneModel not in ()] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:.., /testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:.., /testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[UserID, MobilePhoneModel], file_type=parquet, predicate=MobilePhoneModel@34 != , pruning_predicate=MobilePhoneModel_null_count@2 != row_count@3 AND (MobilePhoneModel_min@0 != OR != MobilePhoneModel_max@1), required_guarantees=[MobilePhoneModel not in ()] └────────────────────────────────────────────────── "); Ok(()) @@ -247,7 +239,7 @@ mod tests { #[tokio::test] async fn test_clickbench_11() -> Result<()> { let display = test_clickbench_query("q11").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [u@2 DESC], fetch=10 │ SortExec: TopK(fetch=10), expr=[u@2 DESC], preserve_partitioning=[true] @@ -262,11 +254,10 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([MobilePhone@0, MobilePhoneModel@1, alias1@2], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([MobilePhone@0, MobilePhoneModel@1, alias1@2], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[MobilePhone@1 as MobilePhone, MobilePhoneModel@2 as MobilePhoneModel, UserID@0 as alias1], aggr=[] │ FilterExec: MobilePhoneModel@2 != - │ PartitionIsolatorExec: tasks=3 partitions=5 - │ DataSourceExec: file_groups={5 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[UserID, MobilePhone, MobilePhoneModel], file_type=parquet, predicate=MobilePhoneModel@34 != , pruning_predicate=MobilePhoneModel_null_count@2 != row_count@3 AND (MobilePhoneModel_min@0 != OR != MobilePhoneModel_max@1), required_guarantees=[MobilePhoneModel not in ()] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:.., /testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:.., /testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[UserID, MobilePhone, MobilePhoneModel], file_type=parquet, predicate=MobilePhoneModel@34 != , pruning_predicate=MobilePhoneModel_null_count@2 != row_count@3 AND (MobilePhoneModel_min@0 != OR != MobilePhoneModel_max@1), required_guarantees=[MobilePhoneModel not in ()] └────────────────────────────────────────────────── "); Ok(()) @@ -275,7 +266,7 @@ mod tests { #[tokio::test] async fn test_clickbench_12() -> Result<()> { let display = test_clickbench_query("q12").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [c@1 DESC], fetch=10 │ [Stage 2] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -287,11 +278,10 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([SearchPhrase@0], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([SearchPhrase@0], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[SearchPhrase@0 as SearchPhrase], aggr=[count(Int64(1))] │ FilterExec: SearchPhrase@0 != - │ PartitionIsolatorExec: tasks=3 partitions=5 - │ DataSourceExec: file_groups={5 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[SearchPhrase], file_type=parquet, predicate=SearchPhrase@39 != , pruning_predicate=SearchPhrase_null_count@2 != row_count@3 AND (SearchPhrase_min@0 != OR != SearchPhrase_max@1), required_guarantees=[SearchPhrase not in ()] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:.., /testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:.., /testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[SearchPhrase], file_type=parquet, predicate=SearchPhrase@39 != , pruning_predicate=SearchPhrase_null_count@2 != row_count@3 AND (SearchPhrase_min@0 != OR != SearchPhrase_max@1), required_guarantees=[SearchPhrase not in ()] └────────────────────────────────────────────────── "); Ok(()) @@ -300,7 +290,7 @@ mod tests { #[tokio::test] async fn test_clickbench_13() -> Result<()> { let display = test_clickbench_query("q13").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [u@1 DESC], fetch=10 │ SortExec: TopK(fetch=10), expr=[u@1 DESC], preserve_partitioning=[true] @@ -315,11 +305,10 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([SearchPhrase@0, alias1@1], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([SearchPhrase@0, alias1@1], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[SearchPhrase@1 as SearchPhrase, UserID@0 as alias1], aggr=[] │ FilterExec: SearchPhrase@1 != - │ PartitionIsolatorExec: tasks=3 partitions=5 - │ DataSourceExec: file_groups={5 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[UserID, SearchPhrase], file_type=parquet, predicate=SearchPhrase@39 != , pruning_predicate=SearchPhrase_null_count@2 != row_count@3 AND (SearchPhrase_min@0 != OR != SearchPhrase_max@1), required_guarantees=[SearchPhrase not in ()] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:.., /testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:.., /testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[UserID, SearchPhrase], file_type=parquet, predicate=SearchPhrase@39 != , pruning_predicate=SearchPhrase_null_count@2 != row_count@3 AND (SearchPhrase_min@0 != OR != SearchPhrase_max@1), required_guarantees=[SearchPhrase not in ()] └────────────────────────────────────────────────── "); Ok(()) @@ -328,7 +317,7 @@ mod tests { #[tokio::test] async fn test_clickbench_14() -> Result<()> { let display = test_clickbench_query("q14").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [c@2 DESC], fetch=10 │ [Stage 2] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -340,11 +329,10 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([SearchEngineID@0, SearchPhrase@1], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([SearchEngineID@0, SearchPhrase@1], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[SearchEngineID@0 as SearchEngineID, SearchPhrase@1 as SearchPhrase], aggr=[count(Int64(1))] │ FilterExec: SearchPhrase@1 != - │ PartitionIsolatorExec: tasks=3 partitions=5 - │ DataSourceExec: file_groups={5 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[SearchEngineID, SearchPhrase], file_type=parquet, predicate=SearchPhrase@39 != , pruning_predicate=SearchPhrase_null_count@2 != row_count@3 AND (SearchPhrase_min@0 != OR != SearchPhrase_max@1), required_guarantees=[SearchPhrase not in ()] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:.., /testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:.., /testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[SearchEngineID, SearchPhrase], file_type=parquet, predicate=SearchPhrase@39 != , pruning_predicate=SearchPhrase_null_count@2 != row_count@3 AND (SearchPhrase_min@0 != OR != SearchPhrase_max@1), required_guarantees=[SearchPhrase not in ()] └────────────────────────────────────────────────── "); Ok(()) @@ -353,7 +341,7 @@ mod tests { #[tokio::test] async fn test_clickbench_15() -> Result<()> { let display = test_clickbench_query("q15").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ ProjectionExec: expr=[UserID@0 as UserID, count(*)@1 as count(*)] │ SortPreservingMergeExec: [count(Int64(1))@2 DESC], fetch=10 @@ -366,10 +354,9 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([UserID@0], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([UserID@0], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[UserID@0 as UserID], aggr=[count(Int64(1))] - │ PartitionIsolatorExec: tasks=3 partitions=5 - │ DataSourceExec: file_groups={5 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[UserID], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:.., /testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:.., /testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[UserID], file_type=parquet └────────────────────────────────────────────────── "); Ok(()) @@ -378,7 +365,7 @@ mod tests { #[tokio::test] async fn test_clickbench_16() -> Result<()> { let display = test_clickbench_query("q16").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ ProjectionExec: expr=[UserID@0 as UserID, SearchPhrase@1 as SearchPhrase, count(*)@2 as count(*)] │ SortPreservingMergeExec: [count(Int64(1))@3 DESC], fetch=10 @@ -391,10 +378,9 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([UserID@0, SearchPhrase@1], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([UserID@0, SearchPhrase@1], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[UserID@0 as UserID, SearchPhrase@1 as SearchPhrase], aggr=[count(Int64(1))] - │ PartitionIsolatorExec: tasks=3 partitions=5 - │ DataSourceExec: file_groups={5 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[UserID, SearchPhrase], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:.., /testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:.., /testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[UserID, SearchPhrase], file_type=parquet └────────────────────────────────────────────────── "); Ok(()) @@ -424,10 +410,9 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([UserID@0, date_part(Utf8("MINUTE"),to_timestamp_seconds(hits.EventTime))@1, SearchPhrase@2], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([UserID@0, date_part(Utf8("MINUTE"),to_timestamp_seconds(hits.EventTime))@1, SearchPhrase@2], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[UserID@1 as UserID, date_part(MINUTE, to_timestamp_seconds(EventTime@0)) as date_part(Utf8("MINUTE"),to_timestamp_seconds(hits.EventTime)), SearchPhrase@2 as SearchPhrase], aggr=[count(Int64(1))] - │ PartitionIsolatorExec: tasks=3 partitions=5 - │ DataSourceExec: file_groups={5 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[EventTime, UserID, SearchPhrase], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:.., /testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:.., /testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[EventTime, UserID, SearchPhrase], file_type=parquet └────────────────────────────────────────────────── "#); Ok(()) @@ -436,15 +421,14 @@ mod tests { #[tokio::test] async fn test_clickbench_19() -> Result<()> { let display = test_clickbench_query("q19").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ CoalescePartitionsExec - │ [Stage 1] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 1] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 1 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ FilterExec: UserID@0 = 435090932899640449 - │ PartitionIsolatorExec: tasks=3 partitions=5 - │ DataSourceExec: file_groups={5 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[UserID], file_type=parquet, predicate=UserID@9 = 435090932899640449, pruning_predicate=UserID_null_count@2 != row_count@3 AND UserID_min@0 <= 435090932899640449 AND 435090932899640449 <= UserID_max@1, required_guarantees=[UserID in (435090932899640449)] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:.., /testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:.., /testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[UserID], file_type=parquet, predicate=UserID@9 = 435090932899640449, pruning_predicate=UserID_null_count@2 != row_count@3 AND UserID_min@0 <= 435090932899640449 AND 435090932899640449 <= UserID_max@1, required_guarantees=[UserID in (435090932899640449)] └────────────────────────────────────────────────── "); Ok(()) @@ -453,18 +437,17 @@ mod tests { #[tokio::test] async fn test_clickbench_20() -> Result<()> { let display = test_clickbench_query("q20").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ ProjectionExec: expr=[count(Int64(1))@0 as count(*)] │ AggregateExec: mode=Final, gby=[], aggr=[count(Int64(1))] │ CoalescePartitionsExec - │ [Stage 1] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 1] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 1 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[count(Int64(1))] │ FilterExec: CAST(URL@0 AS Utf8View) LIKE %google%, projection=[] - │ PartitionIsolatorExec: tasks=3 partitions=5 - │ DataSourceExec: file_groups={5 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[URL], file_type=parquet, predicate=CAST(URL@13 AS Utf8View) LIKE %google% + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:.., /testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:.., /testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[URL], file_type=parquet, predicate=CAST(URL@13 AS Utf8View) LIKE %google% └────────────────────────────────────────────────── "); Ok(()) @@ -489,16 +472,15 @@ mod tests { #[tokio::test] async fn test_clickbench_23() -> Result<()> { let display = test_clickbench_query("q23").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [EventTime@4 ASC NULLS LAST], fetch=10 - │ [Stage 1] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 1] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 1 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ SortExec: TopK(fetch=10), expr=[EventTime@4 ASC NULLS LAST], preserve_partitioning=[true] │ FilterExec: CAST(URL@13 AS Utf8View) LIKE %google% - │ PartitionIsolatorExec: tasks=3 partitions=5 - │ DataSourceExec: file_groups={5 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[WatchID, JavaEnable, Title, GoodEvent, EventTime, EventDate, CounterID, ClientIP, RegionID, UserID, CounterClass, OS, UserAgent, URL, Referer, IsRefresh, RefererCategoryID, RefererRegionID, URLCategoryID, URLRegionID, ResolutionWidth, ResolutionHeight, ResolutionDepth, FlashMajor, FlashMinor, FlashMinor2, NetMajor, NetMinor, UserAgentMajor, UserAgentMinor, CookieEnable, JavascriptEnable, IsMobile, MobilePhone, MobilePhoneModel, Params, IPNetworkID, TraficSourceID, SearchEngineID, SearchPhrase, AdvEngineID, IsArtifical, WindowClientWidth, WindowClientHeight, ClientTimeZone, ClientEventTime, SilverlightVersion1, SilverlightVersion2, SilverlightVersion3, SilverlightVersion4, PageCharset, CodeVersion, IsLink, IsDownload, IsNotBounce, FUniqID, OriginalURL, HID, IsOldCounter, IsEvent, IsParameter, DontCountHits, WithHash, HitColor, LocalEventTime, Age, Sex, Income, Interests, Robotness, RemoteIP, WindowName, OpenerName, HistoryLength, BrowserLanguage, BrowserCountry, SocialNetwork, SocialAction, HTTPError, SendTiming, DNSTiming, ConnectTiming, ResponseStartTiming, ResponseEndTiming, FetchTiming, SocialSourceNetworkID, SocialSourcePage, ParamPrice, ParamOrderID, ParamCurrency, ParamCurrencyID, OpenstatServiceName, OpenstatCampaignID, OpenstatAdID, OpenstatSourceID, UTMSource, UTMMedium, UTMCampaign, UTMContent, UTMTerm, FromTag, HasGCLID, RefererHash, URLHash, CLID], file_type=parquet, predicate=CAST(URL@13 AS Utf8View) LIKE %google% AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:.., /testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:.., /testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[WatchID, JavaEnable, Title, GoodEvent, EventTime, EventDate, CounterID, ClientIP, RegionID, UserID, CounterClass, OS, UserAgent, URL, Referer, IsRefresh, RefererCategoryID, RefererRegionID, URLCategoryID, URLRegionID, ResolutionWidth, ResolutionHeight, ResolutionDepth, FlashMajor, FlashMinor, FlashMinor2, NetMajor, NetMinor, UserAgentMajor, UserAgentMinor, CookieEnable, JavascriptEnable, IsMobile, MobilePhone, MobilePhoneModel, Params, IPNetworkID, TraficSourceID, SearchEngineID, SearchPhrase, AdvEngineID, IsArtifical, WindowClientWidth, WindowClientHeight, ClientTimeZone, ClientEventTime, SilverlightVersion1, SilverlightVersion2, SilverlightVersion3, SilverlightVersion4, PageCharset, CodeVersion, IsLink, IsDownload, IsNotBounce, FUniqID, OriginalURL, HID, IsOldCounter, IsEvent, IsParameter, DontCountHits, WithHash, HitColor, LocalEventTime, Age, Sex, Income, Interests, Robotness, RemoteIP, WindowName, OpenerName, HistoryLength, BrowserLanguage, BrowserCountry, SocialNetwork, SocialAction, HTTPError, SendTiming, DNSTiming, ConnectTiming, ResponseStartTiming, ResponseEndTiming, FetchTiming, SocialSourceNetworkID, SocialSourcePage, ParamPrice, ParamOrderID, ParamCurrency, ParamCurrencyID, OpenstatServiceName, OpenstatCampaignID, OpenstatAdID, OpenstatSourceID, UTMSource, UTMMedium, UTMCampaign, UTMContent, UTMTerm, FromTag, HasGCLID, RefererHash, URLHash, CLID], file_type=parquet, predicate=CAST(URL@13 AS Utf8View) LIKE %google% AND DynamicFilter [ empty ] └────────────────────────────────────────────────── "); Ok(()) @@ -515,16 +497,15 @@ mod tests { #[tokio::test] async fn test_clickbench_25() -> Result<()> { let display = test_clickbench_query("q25").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [SearchPhrase@0 ASC NULLS LAST], fetch=10 - │ [Stage 1] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 1] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 1 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ SortExec: TopK(fetch=10), expr=[SearchPhrase@0 ASC NULLS LAST], preserve_partitioning=[true] │ FilterExec: SearchPhrase@0 != - │ PartitionIsolatorExec: tasks=3 partitions=5 - │ DataSourceExec: file_groups={5 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[SearchPhrase], file_type=parquet, predicate=SearchPhrase@39 != AND DynamicFilter [ empty ], pruning_predicate=SearchPhrase_null_count@2 != row_count@3 AND (SearchPhrase_min@0 != OR != SearchPhrase_max@1), required_guarantees=[SearchPhrase not in ()] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:.., /testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:.., /testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[SearchPhrase], file_type=parquet, predicate=SearchPhrase@39 != AND DynamicFilter [ empty ], pruning_predicate=SearchPhrase_null_count@2 != row_count@3 AND (SearchPhrase_min@0 != OR != SearchPhrase_max@1), required_guarantees=[SearchPhrase not in ()] └────────────────────────────────────────────────── "); Ok(()) @@ -533,18 +514,17 @@ mod tests { #[tokio::test] async fn test_clickbench_26() -> Result<()> { let display = test_clickbench_query("q26").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ ProjectionExec: expr=[SearchPhrase@0 as SearchPhrase] │ SortPreservingMergeExec: [EventTime@1 ASC NULLS LAST, SearchPhrase@0 ASC NULLS LAST], fetch=10 - │ [Stage 1] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 1] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 1 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ SortExec: TopK(fetch=10), expr=[EventTime@1 ASC NULLS LAST, SearchPhrase@0 ASC NULLS LAST], preserve_partitioning=[true] │ ProjectionExec: expr=[SearchPhrase@1 as SearchPhrase, EventTime@0 as EventTime] │ FilterExec: SearchPhrase@1 != - │ PartitionIsolatorExec: tasks=3 partitions=5 - │ DataSourceExec: file_groups={5 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[EventTime, SearchPhrase], file_type=parquet, predicate=SearchPhrase@39 != AND DynamicFilter [ empty ], pruning_predicate=SearchPhrase_null_count@2 != row_count@3 AND (SearchPhrase_min@0 != OR != SearchPhrase_max@1), required_guarantees=[SearchPhrase not in ()] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:.., /testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:.., /testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[EventTime, SearchPhrase], file_type=parquet, predicate=SearchPhrase@39 != AND DynamicFilter [ empty ], pruning_predicate=SearchPhrase_null_count@2 != row_count@3 AND (SearchPhrase_min@0 != OR != SearchPhrase_max@1), required_guarantees=[SearchPhrase not in ()] └────────────────────────────────────────────────── "); Ok(()) @@ -553,7 +533,7 @@ mod tests { #[tokio::test] async fn test_clickbench_27() -> Result<()> { let display = test_clickbench_query("q27").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [l@1 DESC], fetch=25 │ [Stage 2] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -566,11 +546,10 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([CounterID@0], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([CounterID@0], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[CounterID@0 as CounterID], aggr=[avg(length(hits.URL)), count(Int64(1))] │ FilterExec: URL@1 != - │ PartitionIsolatorExec: tasks=3 partitions=5 - │ DataSourceExec: file_groups={5 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[CounterID, URL], file_type=parquet, predicate=URL@13 != , pruning_predicate=URL_null_count@2 != row_count@3 AND (URL_min@0 != OR != URL_max@1), required_guarantees=[URL not in ()] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:.., /testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:.., /testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[CounterID, URL], file_type=parquet, predicate=URL@13 != , pruning_predicate=URL_null_count@2 != row_count@3 AND (URL_min@0 != OR != URL_max@1), required_guarantees=[URL not in ()] └────────────────────────────────────────────────── "); Ok(()) @@ -592,11 +571,10 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([regexp_replace(hits.Referer,Utf8("^https?://(?:www\.)?([^/]+)/.*$"),Utf8("\1"))@0], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([regexp_replace(hits.Referer,Utf8("^https?://(?:www\.)?([^/]+)/.*$"),Utf8("\1"))@0], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[regexp_replace(CAST(Referer@0 AS LargeUtf8), ^https?://(?:www\.)?([^/]+)/.*$, \1) as regexp_replace(hits.Referer,Utf8("^https?://(?:www\.)?([^/]+)/.*$"),Utf8("\1"))], aggr=[avg(length(hits.Referer)), count(Int64(1)), min(hits.Referer)] │ FilterExec: Referer@0 != - │ PartitionIsolatorExec: tasks=3 partitions=5 - │ DataSourceExec: file_groups={5 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[Referer], file_type=parquet, predicate=Referer@14 != , pruning_predicate=Referer_null_count@2 != row_count@3 AND (Referer_min@0 != OR != Referer_max@1), required_guarantees=[Referer not in ()] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:.., /testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:.., /testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[Referer], file_type=parquet, predicate=Referer@14 != , pruning_predicate=Referer_null_count@2 != row_count@3 AND (Referer_min@0 != OR != Referer_max@1), required_guarantees=[Referer not in ()] └────────────────────────────────────────────────── "#); Ok(()) @@ -605,16 +583,15 @@ mod tests { #[tokio::test] async fn test_clickbench_29() -> Result<()> { let display = test_clickbench_query("q29").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ AggregateExec: mode=Final, gby=[], aggr=[sum(hits.ResolutionWidth), sum(hits.ResolutionWidth + Int64(1)), sum(hits.ResolutionWidth + Int64(2)), sum(hits.ResolutionWidth + Int64(3)), sum(hits.ResolutionWidth + Int64(4)), sum(hits.ResolutionWidth + Int64(5)), sum(hits.ResolutionWidth + Int64(6)), sum(hits.ResolutionWidth + Int64(7)), sum(hits.ResolutionWidth + Int64(8)), sum(hits.ResolutionWidth + Int64(9)), sum(hits.ResolutionWidth + Int64(10)), sum(hits.ResolutionWidth + Int64(11)), sum(hits.ResolutionWidth + Int64(12)), sum(hits.ResolutionWidth + Int64(13)), sum(hits.ResolutionWidth + Int64(14)), sum(hits.ResolutionWidth + Int64(15)), sum(hits.ResolutionWidth + Int64(16)), sum(hits.ResolutionWidth + Int64(17)), sum(hits.ResolutionWidth + Int64(18)), sum(hits.ResolutionWidth + Int64(19)), sum(hits.ResolutionWidth + Int64(20)), sum(hits.ResolutionWidth + Int64(21)), sum(hits.ResolutionWidth + Int64(22)), sum(hits.ResolutionWidth + Int64(23)), sum(hits.ResolutionWidth + Int64(24)), sum(hits.ResolutionWidth + Int64(25)), sum(hits.ResolutionWidth + Int64(26)), sum(hits.ResolutionWidth + Int64(27)), sum(hits.ResolutionWidth + Int64(28)), sum(hits.ResolutionWidth + Int64(29)), sum(hits.ResolutionWidth + Int64(30)), sum(hits.ResolutionWidth + Int64(31)), sum(hits.ResolutionWidth + Int64(32)), sum(hits.ResolutionWidth + Int64(33)), sum(hits.ResolutionWidth + Int64(34)), sum(hits.ResolutionWidth + Int64(35)), sum(hits.ResolutionWidth + Int64(36)), sum(hits.ResolutionWidth + Int64(37)), sum(hits.ResolutionWidth + Int64(38)), sum(hits.ResolutionWidth + Int64(39)), sum(hits.ResolutionWidth + Int64(40)), sum(hits.ResolutionWidth + Int64(41)), sum(hits.ResolutionWidth + Int64(42)), sum(hits.ResolutionWidth + Int64(43)), sum(hits.ResolutionWidth + Int64(44)), sum(hits.ResolutionWidth + Int64(45)), sum(hits.ResolutionWidth + Int64(46)), sum(hits.ResolutionWidth + Int64(47)), sum(hits.ResolutionWidth + Int64(48)), sum(hits.ResolutionWidth + Int64(49)), sum(hits.ResolutionWidth + Int64(50)), sum(hits.ResolutionWidth + Int64(51)), sum(hits.ResolutionWidth + Int64(52)), sum(hits.ResolutionWidth + Int64(53)), sum(hits.ResolutionWidth + Int64(54)), sum(hits.ResolutionWidth + Int64(55)), sum(hits.ResolutionWidth + Int64(56)), sum(hits.ResolutionWidth + Int64(57)), sum(hits.ResolutionWidth + Int64(58)), sum(hits.ResolutionWidth + Int64(59)), sum(hits.ResolutionWidth + Int64(60)), sum(hits.ResolutionWidth + Int64(61)), sum(hits.ResolutionWidth + Int64(62)), sum(hits.ResolutionWidth + Int64(63)), sum(hits.ResolutionWidth + Int64(64)), sum(hits.ResolutionWidth + Int64(65)), sum(hits.ResolutionWidth + Int64(66)), sum(hits.ResolutionWidth + Int64(67)), sum(hits.ResolutionWidth + Int64(68)), sum(hits.ResolutionWidth + Int64(69)), sum(hits.ResolutionWidth + Int64(70)), sum(hits.ResolutionWidth + Int64(71)), sum(hits.ResolutionWidth + Int64(72)), sum(hits.ResolutionWidth + Int64(73)), sum(hits.ResolutionWidth + Int64(74)), sum(hits.ResolutionWidth + Int64(75)), sum(hits.ResolutionWidth + Int64(76)), sum(hits.ResolutionWidth + Int64(77)), sum(hits.ResolutionWidth + Int64(78)), sum(hits.ResolutionWidth + Int64(79)), sum(hits.ResolutionWidth + Int64(80)), sum(hits.ResolutionWidth + Int64(81)), sum(hits.ResolutionWidth + Int64(82)), sum(hits.ResolutionWidth + Int64(83)), sum(hits.ResolutionWidth + Int64(84)), sum(hits.ResolutionWidth + Int64(85)), sum(hits.ResolutionWidth + Int64(86)), sum(hits.ResolutionWidth + Int64(87)), sum(hits.ResolutionWidth + Int64(88)), sum(hits.ResolutionWidth + Int64(89))] │ CoalescePartitionsExec - │ [Stage 1] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 1] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 1 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[sum(hits.ResolutionWidth), sum(hits.ResolutionWidth + Int64(1)), sum(hits.ResolutionWidth + Int64(2)), sum(hits.ResolutionWidth + Int64(3)), sum(hits.ResolutionWidth + Int64(4)), sum(hits.ResolutionWidth + Int64(5)), sum(hits.ResolutionWidth + Int64(6)), sum(hits.ResolutionWidth + Int64(7)), sum(hits.ResolutionWidth + Int64(8)), sum(hits.ResolutionWidth + Int64(9)), sum(hits.ResolutionWidth + Int64(10)), sum(hits.ResolutionWidth + Int64(11)), sum(hits.ResolutionWidth + Int64(12)), sum(hits.ResolutionWidth + Int64(13)), sum(hits.ResolutionWidth + Int64(14)), sum(hits.ResolutionWidth + Int64(15)), sum(hits.ResolutionWidth + Int64(16)), sum(hits.ResolutionWidth + Int64(17)), sum(hits.ResolutionWidth + Int64(18)), sum(hits.ResolutionWidth + Int64(19)), sum(hits.ResolutionWidth + Int64(20)), sum(hits.ResolutionWidth + Int64(21)), sum(hits.ResolutionWidth + Int64(22)), sum(hits.ResolutionWidth + Int64(23)), sum(hits.ResolutionWidth + Int64(24)), sum(hits.ResolutionWidth + Int64(25)), sum(hits.ResolutionWidth + Int64(26)), sum(hits.ResolutionWidth + Int64(27)), sum(hits.ResolutionWidth + Int64(28)), sum(hits.ResolutionWidth + Int64(29)), sum(hits.ResolutionWidth + Int64(30)), sum(hits.ResolutionWidth + Int64(31)), sum(hits.ResolutionWidth + Int64(32)), sum(hits.ResolutionWidth + Int64(33)), sum(hits.ResolutionWidth + Int64(34)), sum(hits.ResolutionWidth + Int64(35)), sum(hits.ResolutionWidth + Int64(36)), sum(hits.ResolutionWidth + Int64(37)), sum(hits.ResolutionWidth + Int64(38)), sum(hits.ResolutionWidth + Int64(39)), sum(hits.ResolutionWidth + Int64(40)), sum(hits.ResolutionWidth + Int64(41)), sum(hits.ResolutionWidth + Int64(42)), sum(hits.ResolutionWidth + Int64(43)), sum(hits.ResolutionWidth + Int64(44)), sum(hits.ResolutionWidth + Int64(45)), sum(hits.ResolutionWidth + Int64(46)), sum(hits.ResolutionWidth + Int64(47)), sum(hits.ResolutionWidth + Int64(48)), sum(hits.ResolutionWidth + Int64(49)), sum(hits.ResolutionWidth + Int64(50)), sum(hits.ResolutionWidth + Int64(51)), sum(hits.ResolutionWidth + Int64(52)), sum(hits.ResolutionWidth + Int64(53)), sum(hits.ResolutionWidth + Int64(54)), sum(hits.ResolutionWidth + Int64(55)), sum(hits.ResolutionWidth + Int64(56)), sum(hits.ResolutionWidth + Int64(57)), sum(hits.ResolutionWidth + Int64(58)), sum(hits.ResolutionWidth + Int64(59)), sum(hits.ResolutionWidth + Int64(60)), sum(hits.ResolutionWidth + Int64(61)), sum(hits.ResolutionWidth + Int64(62)), sum(hits.ResolutionWidth + Int64(63)), sum(hits.ResolutionWidth + Int64(64)), sum(hits.ResolutionWidth + Int64(65)), sum(hits.ResolutionWidth + Int64(66)), sum(hits.ResolutionWidth + Int64(67)), sum(hits.ResolutionWidth + Int64(68)), sum(hits.ResolutionWidth + Int64(69)), sum(hits.ResolutionWidth + Int64(70)), sum(hits.ResolutionWidth + Int64(71)), sum(hits.ResolutionWidth + Int64(72)), sum(hits.ResolutionWidth + Int64(73)), sum(hits.ResolutionWidth + Int64(74)), sum(hits.ResolutionWidth + Int64(75)), sum(hits.ResolutionWidth + Int64(76)), sum(hits.ResolutionWidth + Int64(77)), sum(hits.ResolutionWidth + Int64(78)), sum(hits.ResolutionWidth + Int64(79)), sum(hits.ResolutionWidth + Int64(80)), sum(hits.ResolutionWidth + Int64(81)), sum(hits.ResolutionWidth + Int64(82)), sum(hits.ResolutionWidth + Int64(83)), sum(hits.ResolutionWidth + Int64(84)), sum(hits.ResolutionWidth + Int64(85)), sum(hits.ResolutionWidth + Int64(86)), sum(hits.ResolutionWidth + Int64(87)), sum(hits.ResolutionWidth + Int64(88)), sum(hits.ResolutionWidth + Int64(89))] - │ PartitionIsolatorExec: tasks=3 partitions=5 - │ DataSourceExec: file_groups={5 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[CAST(ResolutionWidth@20 AS Int64) as __common_expr_1], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:.., /testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:.., /testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[CAST(ResolutionWidth@20 AS Int64) as __common_expr_1], file_type=parquet └────────────────────────────────────────────────── "); Ok(()) @@ -623,7 +600,7 @@ mod tests { #[tokio::test] async fn test_clickbench_30() -> Result<()> { let display = test_clickbench_query("q30").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [c@2 DESC], fetch=10 │ [Stage 2] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -635,11 +612,10 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([SearchEngineID@0, ClientIP@1], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([SearchEngineID@0, ClientIP@1], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[SearchEngineID@3 as SearchEngineID, ClientIP@0 as ClientIP], aggr=[count(Int64(1)), sum(hits.IsRefresh), avg(hits.ResolutionWidth)] │ FilterExec: SearchPhrase@4 != , projection=[ClientIP@0, IsRefresh@1, ResolutionWidth@2, SearchEngineID@3] - │ PartitionIsolatorExec: tasks=3 partitions=5 - │ DataSourceExec: file_groups={5 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[ClientIP, IsRefresh, ResolutionWidth, SearchEngineID, SearchPhrase], file_type=parquet, predicate=SearchPhrase@39 != , pruning_predicate=SearchPhrase_null_count@2 != row_count@3 AND (SearchPhrase_min@0 != OR != SearchPhrase_max@1), required_guarantees=[SearchPhrase not in ()] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:.., /testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:.., /testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[ClientIP, IsRefresh, ResolutionWidth, SearchEngineID, SearchPhrase], file_type=parquet, predicate=SearchPhrase@39 != , pruning_predicate=SearchPhrase_null_count@2 != row_count@3 AND (SearchPhrase_min@0 != OR != SearchPhrase_max@1), required_guarantees=[SearchPhrase not in ()] └────────────────────────────────────────────────── "); Ok(()) @@ -664,7 +640,7 @@ mod tests { #[tokio::test] async fn test_clickbench_33() -> Result<()> { let display = test_clickbench_query("q33").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [c@1 DESC], fetch=10 │ [Stage 2] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -676,10 +652,9 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([URL@0], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([URL@0], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[URL@0 as URL], aggr=[count(Int64(1))] - │ PartitionIsolatorExec: tasks=3 partitions=5 - │ DataSourceExec: file_groups={5 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[URL], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:.., /testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:.., /testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[URL], file_type=parquet └────────────────────────────────────────────────── "); Ok(()) @@ -688,7 +663,7 @@ mod tests { #[tokio::test] async fn test_clickbench_34() -> Result<()> { let display = test_clickbench_query("q34").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [c@2 DESC], fetch=10 │ [Stage 2] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -700,10 +675,9 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([URL@0], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([URL@0], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[URL@0 as URL], aggr=[count(Int64(1))] - │ PartitionIsolatorExec: tasks=3 partitions=5 - │ DataSourceExec: file_groups={5 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[URL], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:.., /testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:.., /testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[URL], file_type=parquet └────────────────────────────────────────────────── "); Ok(()) @@ -712,7 +686,7 @@ mod tests { #[tokio::test] async fn test_clickbench_35() -> Result<()> { let display = test_clickbench_query("q35").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [c@4 DESC], fetch=10 │ [Stage 2] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -724,10 +698,9 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([ClientIP@0, hits.ClientIP - Int64(1)@1, hits.ClientIP - Int64(2)@2, hits.ClientIP - Int64(3)@3], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([ClientIP@0, hits.ClientIP - Int64(1)@1, hits.ClientIP - Int64(2)@2, hits.ClientIP - Int64(3)@3], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[ClientIP@1 as ClientIP, __common_expr_1@0 - 1 as hits.ClientIP - Int64(1), __common_expr_1@0 - 2 as hits.ClientIP - Int64(2), __common_expr_1@0 - 3 as hits.ClientIP - Int64(3)], aggr=[count(Int64(1))] - │ PartitionIsolatorExec: tasks=3 partitions=5 - │ DataSourceExec: file_groups={5 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[CAST(ClientIP@7 AS Int64) as __common_expr_1, ClientIP], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:.., /testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:.., /testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[CAST(ClientIP@7 AS Int64) as __common_expr_1, ClientIP], file_type=parquet └────────────────────────────────────────────────── "); Ok(()) @@ -736,7 +709,7 @@ mod tests { #[tokio::test] async fn test_clickbench_36() -> Result<()> { let display = test_clickbench_query("q36").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [pageviews@1 DESC], fetch=10 │ [Stage 2] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -748,11 +721,10 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([URL@0], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([URL@0], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[URL@0 as URL], aggr=[count(Int64(1))] │ FilterExec: CounterID@1 = 62 AND CAST(EventDate@0 AS Utf8) >= 2013-07-01 AND CAST(EventDate@0 AS Utf8) <= 2013-07-31 AND DontCountHits@4 = 0 AND IsRefresh@3 = 0 AND URL@2 != , projection=[URL@2] - │ PartitionIsolatorExec: tasks=3 partitions=5 - │ DataSourceExec: file_groups={5 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[EventDate, CounterID, URL, IsRefresh, DontCountHits], file_type=parquet, predicate=CounterID@6 = 62 AND CAST(EventDate@5 AS Utf8) >= 2013-07-01 AND CAST(EventDate@5 AS Utf8) <= 2013-07-31 AND DontCountHits@61 = 0 AND IsRefresh@15 = 0 AND URL@13 != , pruning_predicate=CounterID_null_count@2 != row_count@3 AND CounterID_min@0 <= 62 AND 62 <= CounterID_max@1 AND DontCountHits_null_count@6 != row_count@3 AND DontCountHits_min@4 <= 0 AND 0 <= DontCountHits_max@5 AND IsRefresh_null_count@9 != row_count@3 AND IsRefresh_min@7 <= 0 AND 0 <= IsRefresh_max@8 AND URL_null_count@12 != row_count@3 AND (URL_min@10 != OR != URL_max@11), required_guarantees=[CounterID in (62), DontCountHits in (0), IsRefresh in (0), URL not in ()] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:.., /testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:.., /testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[EventDate, CounterID, URL, IsRefresh, DontCountHits], file_type=parquet, predicate=CounterID@6 = 62 AND CAST(EventDate@5 AS Utf8) >= 2013-07-01 AND CAST(EventDate@5 AS Utf8) <= 2013-07-31 AND DontCountHits@61 = 0 AND IsRefresh@15 = 0 AND URL@13 != , pruning_predicate=CounterID_null_count@2 != row_count@3 AND CounterID_min@0 <= 62 AND 62 <= CounterID_max@1 AND DontCountHits_null_count@6 != row_count@3 AND DontCountHits_min@4 <= 0 AND 0 <= DontCountHits_max@5 AND IsRefresh_null_count@9 != row_count@3 AND IsRefresh_min@7 <= 0 AND 0 <= IsRefresh_max@8 AND URL_null_count@12 != row_count@3 AND (URL_min@10 != OR != URL_max@11), required_guarantees=[CounterID in (62), DontCountHits in (0), IsRefresh in (0), URL not in ()] └────────────────────────────────────────────────── "); Ok(()) @@ -761,7 +733,7 @@ mod tests { #[tokio::test] async fn test_clickbench_37() -> Result<()> { let display = test_clickbench_query("q37").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [pageviews@1 DESC], fetch=10 │ [Stage 2] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -773,11 +745,10 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([Title@0], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([Title@0], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[Title@0 as Title], aggr=[count(Int64(1))] │ FilterExec: CounterID@2 = 62 AND CAST(EventDate@1 AS Utf8) >= 2013-07-01 AND CAST(EventDate@1 AS Utf8) <= 2013-07-31 AND DontCountHits@4 = 0 AND IsRefresh@3 = 0 AND Title@0 != , projection=[Title@0] - │ PartitionIsolatorExec: tasks=3 partitions=5 - │ DataSourceExec: file_groups={5 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[Title, EventDate, CounterID, IsRefresh, DontCountHits], file_type=parquet, predicate=CounterID@6 = 62 AND CAST(EventDate@5 AS Utf8) >= 2013-07-01 AND CAST(EventDate@5 AS Utf8) <= 2013-07-31 AND DontCountHits@61 = 0 AND IsRefresh@15 = 0 AND Title@2 != , pruning_predicate=CounterID_null_count@2 != row_count@3 AND CounterID_min@0 <= 62 AND 62 <= CounterID_max@1 AND DontCountHits_null_count@6 != row_count@3 AND DontCountHits_min@4 <= 0 AND 0 <= DontCountHits_max@5 AND IsRefresh_null_count@9 != row_count@3 AND IsRefresh_min@7 <= 0 AND 0 <= IsRefresh_max@8 AND Title_null_count@12 != row_count@3 AND (Title_min@10 != OR != Title_max@11), required_guarantees=[CounterID in (62), DontCountHits in (0), IsRefresh in (0), Title not in ()] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:.., /testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:.., /testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[Title, EventDate, CounterID, IsRefresh, DontCountHits], file_type=parquet, predicate=CounterID@6 = 62 AND CAST(EventDate@5 AS Utf8) >= 2013-07-01 AND CAST(EventDate@5 AS Utf8) <= 2013-07-31 AND DontCountHits@61 = 0 AND IsRefresh@15 = 0 AND Title@2 != , pruning_predicate=CounterID_null_count@2 != row_count@3 AND CounterID_min@0 <= 62 AND 62 <= CounterID_max@1 AND DontCountHits_null_count@6 != row_count@3 AND DontCountHits_min@4 <= 0 AND 0 <= DontCountHits_max@5 AND IsRefresh_null_count@9 != row_count@3 AND IsRefresh_min@7 <= 0 AND 0 <= IsRefresh_max@8 AND Title_null_count@12 != row_count@3 AND (Title_min@10 != OR != Title_max@11), required_guarantees=[CounterID in (62), DontCountHits in (0), IsRefresh in (0), Title not in ()] └────────────────────────────────────────────────── "); Ok(()) @@ -786,7 +757,7 @@ mod tests { #[tokio::test] async fn test_clickbench_38() -> Result<()> { let display = test_clickbench_query("q38").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ GlobalLimitExec: skip=1000, fetch=10 │ SortPreservingMergeExec: [pageviews@1 DESC], fetch=1010 @@ -799,11 +770,10 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([URL@0], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([URL@0], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[URL@0 as URL], aggr=[count(Int64(1))] │ FilterExec: CounterID@1 = 62 AND CAST(EventDate@0 AS Utf8) >= 2013-07-01 AND CAST(EventDate@0 AS Utf8) <= 2013-07-31 AND IsRefresh@3 = 0 AND IsLink@4 != 0 AND IsDownload@5 = 0, projection=[URL@2] - │ PartitionIsolatorExec: tasks=3 partitions=5 - │ DataSourceExec: file_groups={5 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[EventDate, CounterID, URL, IsRefresh, IsLink, IsDownload], file_type=parquet, predicate=CounterID@6 = 62 AND CAST(EventDate@5 AS Utf8) >= 2013-07-01 AND CAST(EventDate@5 AS Utf8) <= 2013-07-31 AND IsRefresh@15 = 0 AND IsLink@52 != 0 AND IsDownload@53 = 0, pruning_predicate=CounterID_null_count@2 != row_count@3 AND CounterID_min@0 <= 62 AND 62 <= CounterID_max@1 AND IsRefresh_null_count@6 != row_count@3 AND IsRefresh_min@4 <= 0 AND 0 <= IsRefresh_max@5 AND IsLink_null_count@9 != row_count@3 AND (IsLink_min@7 != 0 OR 0 != IsLink_max@8) AND IsDownload_null_count@12 != row_count@3 AND IsDownload_min@10 <= 0 AND 0 <= IsDownload_max@11, required_guarantees=[CounterID in (62), IsDownload in (0), IsLink not in (0), IsRefresh in (0)] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:.., /testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:.., /testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[EventDate, CounterID, URL, IsRefresh, IsLink, IsDownload], file_type=parquet, predicate=CounterID@6 = 62 AND CAST(EventDate@5 AS Utf8) >= 2013-07-01 AND CAST(EventDate@5 AS Utf8) <= 2013-07-31 AND IsRefresh@15 = 0 AND IsLink@52 != 0 AND IsDownload@53 = 0, pruning_predicate=CounterID_null_count@2 != row_count@3 AND CounterID_min@0 <= 62 AND 62 <= CounterID_max@1 AND IsRefresh_null_count@6 != row_count@3 AND IsRefresh_min@4 <= 0 AND 0 <= IsRefresh_max@5 AND IsLink_null_count@9 != row_count@3 AND (IsLink_min@7 != 0 OR 0 != IsLink_max@8) AND IsDownload_null_count@12 != row_count@3 AND IsDownload_min@10 <= 0 AND 0 <= IsDownload_max@11, required_guarantees=[CounterID in (62), IsDownload in (0), IsLink not in (0), IsRefresh in (0)] └────────────────────────────────────────────────── "); Ok(()) @@ -825,11 +795,10 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([TraficSourceID@0, SearchEngineID@1, AdvEngineID@2, CASE WHEN hits.SearchEngineID = Int64(0) AND hits.AdvEngineID = Int64(0) THEN hits.Referer ELSE Utf8("") END@3, URL@4], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([TraficSourceID@0, SearchEngineID@1, AdvEngineID@2, CASE WHEN hits.SearchEngineID = Int64(0) AND hits.AdvEngineID = Int64(0) THEN hits.Referer ELSE Utf8("") END@3, URL@4], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[TraficSourceID@2 as TraficSourceID, SearchEngineID@3 as SearchEngineID, AdvEngineID@4 as AdvEngineID, CASE WHEN SearchEngineID@3 = 0 AND AdvEngineID@4 = 0 THEN Referer@1 ELSE END as CASE WHEN hits.SearchEngineID = Int64(0) AND hits.AdvEngineID = Int64(0) THEN hits.Referer ELSE Utf8("") END, URL@0 as URL], aggr=[count(Int64(1))] │ FilterExec: CounterID@1 = 62 AND CAST(EventDate@0 AS Utf8) >= 2013-07-01 AND CAST(EventDate@0 AS Utf8) <= 2013-07-31 AND IsRefresh@4 = 0, projection=[URL@2, Referer@3, TraficSourceID@5, SearchEngineID@6, AdvEngineID@7] - │ PartitionIsolatorExec: tasks=3 partitions=5 - │ DataSourceExec: file_groups={5 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[EventDate, CounterID, URL, Referer, IsRefresh, TraficSourceID, SearchEngineID, AdvEngineID], file_type=parquet, predicate=CounterID@6 = 62 AND CAST(EventDate@5 AS Utf8) >= 2013-07-01 AND CAST(EventDate@5 AS Utf8) <= 2013-07-31 AND IsRefresh@15 = 0, pruning_predicate=CounterID_null_count@2 != row_count@3 AND CounterID_min@0 <= 62 AND 62 <= CounterID_max@1 AND IsRefresh_null_count@6 != row_count@3 AND IsRefresh_min@4 <= 0 AND 0 <= IsRefresh_max@5, required_guarantees=[CounterID in (62), IsRefresh in (0)] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:.., /testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:.., /testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[EventDate, CounterID, URL, Referer, IsRefresh, TraficSourceID, SearchEngineID, AdvEngineID], file_type=parquet, predicate=CounterID@6 = 62 AND CAST(EventDate@5 AS Utf8) >= 2013-07-01 AND CAST(EventDate@5 AS Utf8) <= 2013-07-31 AND IsRefresh@15 = 0, pruning_predicate=CounterID_null_count@2 != row_count@3 AND CounterID_min@0 <= 62 AND 62 <= CounterID_max@1 AND IsRefresh_null_count@6 != row_count@3 AND IsRefresh_min@4 <= 0 AND 0 <= IsRefresh_max@5, required_guarantees=[CounterID in (62), IsRefresh in (0)] └────────────────────────────────────────────────── "#); Ok(()) @@ -838,7 +807,7 @@ mod tests { #[tokio::test] async fn test_clickbench_40() -> Result<()> { let display = test_clickbench_query("q40").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ GlobalLimitExec: skip=100, fetch=10 │ SortPreservingMergeExec: [pageviews@2 DESC], fetch=110 @@ -851,11 +820,10 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([URLHash@0, EventDate@1], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([URLHash@0, EventDate@1], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[URLHash@1 as URLHash, EventDate@0 as EventDate], aggr=[count(Int64(1))] │ FilterExec: CounterID@1 = 62 AND CAST(EventDate@0 AS Utf8) >= 2013-07-01 AND CAST(EventDate@0 AS Utf8) <= 2013-07-31 AND IsRefresh@2 = 0 AND (TraficSourceID@3 = -1 OR TraficSourceID@3 = 6) AND RefererHash@4 = 3594120000172545465, projection=[EventDate@0, URLHash@5] - │ PartitionIsolatorExec: tasks=3 partitions=5 - │ DataSourceExec: file_groups={5 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[EventDate, CounterID, IsRefresh, TraficSourceID, RefererHash, URLHash], file_type=parquet, predicate=CounterID@6 = 62 AND CAST(EventDate@5 AS Utf8) >= 2013-07-01 AND CAST(EventDate@5 AS Utf8) <= 2013-07-31 AND IsRefresh@15 = 0 AND (TraficSourceID@37 = -1 OR TraficSourceID@37 = 6) AND RefererHash@102 = 3594120000172545465, pruning_predicate=CounterID_null_count@2 != row_count@3 AND CounterID_min@0 <= 62 AND 62 <= CounterID_max@1 AND IsRefresh_null_count@6 != row_count@3 AND IsRefresh_min@4 <= 0 AND 0 <= IsRefresh_max@5 AND (TraficSourceID_null_count@9 != row_count@3 AND TraficSourceID_min@7 <= -1 AND -1 <= TraficSourceID_max@8 OR TraficSourceID_null_count@9 != row_count@3 AND TraficSourceID_min@7 <= 6 AND 6 <= TraficSourceID_max@8) AND RefererHash_null_count@12 != row_count@3 AND RefererHash_min@10 <= 3594120000172545465 AND 3594120000172545465 <= RefererHash_max@11, required_guarantees=[CounterID in (62), IsRefresh in (0), RefererHash in (3594120000172545465), TraficSourceID in (-1, 6)] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:.., /testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:.., /testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[EventDate, CounterID, IsRefresh, TraficSourceID, RefererHash, URLHash], file_type=parquet, predicate=CounterID@6 = 62 AND CAST(EventDate@5 AS Utf8) >= 2013-07-01 AND CAST(EventDate@5 AS Utf8) <= 2013-07-31 AND IsRefresh@15 = 0 AND (TraficSourceID@37 = -1 OR TraficSourceID@37 = 6) AND RefererHash@102 = 3594120000172545465, pruning_predicate=CounterID_null_count@2 != row_count@3 AND CounterID_min@0 <= 62 AND 62 <= CounterID_max@1 AND IsRefresh_null_count@6 != row_count@3 AND IsRefresh_min@4 <= 0 AND 0 <= IsRefresh_max@5 AND (TraficSourceID_null_count@9 != row_count@3 AND TraficSourceID_min@7 <= -1 AND -1 <= TraficSourceID_max@8 OR TraficSourceID_null_count@9 != row_count@3 AND TraficSourceID_min@7 <= 6 AND 6 <= TraficSourceID_max@8) AND RefererHash_null_count@12 != row_count@3 AND RefererHash_min@10 <= 3594120000172545465 AND 3594120000172545465 <= RefererHash_max@11, required_guarantees=[CounterID in (62), IsRefresh in (0), RefererHash in (3594120000172545465), TraficSourceID in (-1, 6)] └────────────────────────────────────────────────── "); Ok(()) @@ -864,7 +832,7 @@ mod tests { #[tokio::test] async fn test_clickbench_41() -> Result<()> { let display = test_clickbench_query("q41").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ GlobalLimitExec: skip=10000, fetch=10 │ SortPreservingMergeExec: [pageviews@2 DESC], fetch=10010 @@ -877,11 +845,10 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([WindowClientWidth@0, WindowClientHeight@1], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([WindowClientWidth@0, WindowClientHeight@1], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[WindowClientWidth@0 as WindowClientWidth, WindowClientHeight@1 as WindowClientHeight], aggr=[count(Int64(1))] │ FilterExec: CounterID@1 = 62 AND CAST(EventDate@0 AS Utf8) >= 2013-07-01 AND CAST(EventDate@0 AS Utf8) <= 2013-07-31 AND IsRefresh@2 = 0 AND DontCountHits@5 = 0 AND URLHash@6 = 2868770270353813622, projection=[WindowClientWidth@3, WindowClientHeight@4] - │ PartitionIsolatorExec: tasks=3 partitions=5 - │ DataSourceExec: file_groups={5 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[EventDate, CounterID, IsRefresh, WindowClientWidth, WindowClientHeight, DontCountHits, URLHash], file_type=parquet, predicate=CounterID@6 = 62 AND CAST(EventDate@5 AS Utf8) >= 2013-07-01 AND CAST(EventDate@5 AS Utf8) <= 2013-07-31 AND IsRefresh@15 = 0 AND DontCountHits@61 = 0 AND URLHash@103 = 2868770270353813622, pruning_predicate=CounterID_null_count@2 != row_count@3 AND CounterID_min@0 <= 62 AND 62 <= CounterID_max@1 AND IsRefresh_null_count@6 != row_count@3 AND IsRefresh_min@4 <= 0 AND 0 <= IsRefresh_max@5 AND DontCountHits_null_count@9 != row_count@3 AND DontCountHits_min@7 <= 0 AND 0 <= DontCountHits_max@8 AND URLHash_null_count@12 != row_count@3 AND URLHash_min@10 <= 2868770270353813622 AND 2868770270353813622 <= URLHash_max@11, required_guarantees=[CounterID in (62), DontCountHits in (0), IsRefresh in (0), URLHash in (2868770270353813622)] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:.., /testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:.., /testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[EventDate, CounterID, IsRefresh, WindowClientWidth, WindowClientHeight, DontCountHits, URLHash], file_type=parquet, predicate=CounterID@6 = 62 AND CAST(EventDate@5 AS Utf8) >= 2013-07-01 AND CAST(EventDate@5 AS Utf8) <= 2013-07-31 AND IsRefresh@15 = 0 AND DontCountHits@61 = 0 AND URLHash@103 = 2868770270353813622, pruning_predicate=CounterID_null_count@2 != row_count@3 AND CounterID_min@0 <= 62 AND 62 <= CounterID_max@1 AND IsRefresh_null_count@6 != row_count@3 AND IsRefresh_min@4 <= 0 AND 0 <= IsRefresh_max@5 AND DontCountHits_null_count@9 != row_count@3 AND DontCountHits_min@7 <= 0 AND 0 <= DontCountHits_max@8 AND URLHash_null_count@12 != row_count@3 AND URLHash_min@10 <= 2868770270353813622 AND 2868770270353813622 <= URLHash_max@11, required_guarantees=[CounterID in (62), DontCountHits in (0), IsRefresh in (0), URLHash in (2868770270353813622)] └────────────────────────────────────────────────── "); Ok(()) @@ -903,11 +870,10 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([date_trunc(Utf8("minute"),to_timestamp_seconds(hits.EventTime))@0], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([date_trunc(Utf8("minute"),to_timestamp_seconds(hits.EventTime))@0], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[date_trunc(minute, to_timestamp_seconds(EventTime@0)) as date_trunc(Utf8("minute"),to_timestamp_seconds(hits.EventTime))], aggr=[count(Int64(1))] │ FilterExec: CounterID@2 = 62 AND CAST(EventDate@1 AS Utf8) >= 2013-07-14 AND CAST(EventDate@1 AS Utf8) <= 2013-07-15 AND IsRefresh@3 = 0 AND DontCountHits@4 = 0, projection=[EventTime@0] - │ PartitionIsolatorExec: tasks=3 partitions=5 - │ DataSourceExec: file_groups={5 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[EventTime, EventDate, CounterID, IsRefresh, DontCountHits], file_type=parquet, predicate=CounterID@6 = 62 AND CAST(EventDate@5 AS Utf8) >= 2013-07-14 AND CAST(EventDate@5 AS Utf8) <= 2013-07-15 AND IsRefresh@15 = 0 AND DontCountHits@61 = 0, pruning_predicate=CounterID_null_count@2 != row_count@3 AND CounterID_min@0 <= 62 AND 62 <= CounterID_max@1 AND IsRefresh_null_count@6 != row_count@3 AND IsRefresh_min@4 <= 0 AND 0 <= IsRefresh_max@5 AND DontCountHits_null_count@9 != row_count@3 AND DontCountHits_min@7 <= 0 AND 0 <= DontCountHits_max@8, required_guarantees=[CounterID in (62), DontCountHits in (0), IsRefresh in (0)] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/clickbench/plans_range0-3/hits/0.parquet:.., /testdata/clickbench/plans_range0-3/hits/1.parquet:..], [/testdata/clickbench/plans_range0-3/hits/1.parquet:.., /testdata/clickbench/plans_range0-3/hits/2.parquet:..], [/testdata/clickbench/plans_range0-3/hits/2.parquet:..]]}, projection=[EventTime, EventDate, CounterID, IsRefresh, DontCountHits], file_type=parquet, predicate=CounterID@6 = 62 AND CAST(EventDate@5 AS Utf8) >= 2013-07-14 AND CAST(EventDate@5 AS Utf8) <= 2013-07-15 AND IsRefresh@15 = 0 AND DontCountHits@61 = 0, pruning_predicate=CounterID_null_count@2 != row_count@3 AND CounterID_min@0 <= 62 AND 62 <= CounterID_max@1 AND IsRefresh_null_count@6 != row_count@3 AND IsRefresh_min@4 <= 0 AND 0 <= IsRefresh_max@5 AND DontCountHits_null_count@9 != row_count@3 AND DontCountHits_min@7 <= 0 AND 0 <= DontCountHits_max@8, required_guarantees=[CounterID in (62), DontCountHits in (0), IsRefresh in (0)] └────────────────────────────────────────────────── "#); Ok(()) diff --git a/tests/distributed_aggregation.rs b/tests/distributed_aggregation.rs index fc51626e..0bab37df 100644 --- a/tests/distributed_aggregation.rs +++ b/tests/distributed_aggregation.rs @@ -48,7 +48,7 @@ mod tests { ); assert_snapshot!(physical_distributed_str, - @" + @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ ProjectionExec: expr=[count(*)@0 as count(*), RainToday@1 as RainToday] │ SortPreservingMergeExec: [count(Int64(1))@2 ASC NULLS LAST] @@ -61,10 +61,9 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([RainToday@0], 6), input_partitions=1 + │ RepartitionExec: partitioning=Hash([RainToday@0], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[RainToday@0 as RainToday], aggr=[count(Int64(1))] - │ PartitionIsolatorExec: tasks=3 partitions=3 - │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[RainToday], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[RainToday], file_type=parquet └────────────────────────────────────────────────── ", ); @@ -130,7 +129,7 @@ mod tests { ); assert_snapshot!(physical_distributed_str, - @" + @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ CoalescePartitionsExec │ [Stage 2] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -141,10 +140,9 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([RainToday@0], 6), input_partitions=1 + │ RepartitionExec: partitioning=Hash([RainToday@0], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[RainToday@0 as RainToday], aggr=[count(Int64(1))] - │ PartitionIsolatorExec: tasks=3 partitions=3 - │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[RainToday], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[RainToday], file_type=parquet └────────────────────────────────────────────────── ", ); diff --git a/tests/distributed_unions.rs b/tests/distributed_unions.rs index 51fdc7b8..fbc2a868 100644 --- a/tests/distributed_unions.rs +++ b/tests/distributed_unions.rs @@ -37,7 +37,7 @@ mod tests { let physical_distributed_str = display_plan_ascii(physical_distributed.as_ref(), false); assert_snapshot!(physical_distributed_str, - @" + @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [MinTemp@0 ASC NULLS LAST, RainToday@1 ASC NULLS LAST] │ [Stage 1] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 @@ -46,8 +46,7 @@ mod tests { │ DistributedUnionExec: t0:[c0(0/2)] t1:[c0(1/2)] t2:[c1] │ SortExec: expr=[MinTemp@0 ASC NULLS LAST, RainToday@1 ASC NULLS LAST], preserve_partitioning=[true] │ FilterExec: MinTemp@0 > 10 - │ PartitionIsolatorExec: tasks=2 partitions=3 - │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MinTemp, RainToday], file_type=parquet, predicate=MinTemp@0 > 10, pruning_predicate=MinTemp_null_count@1 != row_count@2 AND MinTemp_max@0 > 10, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MinTemp, RainToday], file_type=parquet, predicate=MinTemp@0 > 10, pruning_predicate=MinTemp_null_count@1 != row_count@2 AND MinTemp_max@0 > 10, required_guarantees=[] │ SortExec: expr=[MinTemp@0 ASC NULLS LAST, RainToday@1 ASC NULLS LAST], preserve_partitioning=[true] │ ProjectionExec: expr=[MaxTemp@0 as MinTemp, RainToday@1 as RainToday] │ FilterExec: MaxTemp@0 < 30 @@ -234,39 +233,35 @@ mod tests { ┌───── Stage 3 ── Tasks: t0:[p0] t1:[p1] │ DistributedUnionExec: t0:[c0] t1:[c1] │ CoalescePartitionsExec: fetch=1000000 - │ [Stage 1] => NetworkCoalesceExec: output_partitions=3, input_tasks=3 + │ [Stage 1] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ ProjectionExec: expr=[MaxTemp@0 as MinTemp, RainToday@1 as RainToday] │ CoalescePartitionsExec: fetch=1000000 - │ [Stage 2] => NetworkCoalesceExec: output_partitions=3, input_tasks=3 + │ [Stage 2] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0] t1:[p1] t2:[p2] + ┌───── Stage 1 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ FilterExec: MinTemp@0 > 10, fetch=1000000 - │ PartitionIsolatorExec: tasks=3 partitions=3 - │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MinTemp, RainToday], file_type=parquet, predicate=MinTemp@0 > 10 AND DynamicFilter [ empty ], pruning_predicate=MinTemp_null_count@1 != row_count@2 AND MinTemp_max@0 > 10, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MinTemp, RainToday], file_type=parquet, predicate=MinTemp@0 > 10 AND DynamicFilter [ empty ], pruning_predicate=MinTemp_null_count@1 != row_count@2 AND MinTemp_max@0 > 10, required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0] t1:[p1] t2:[p2] + ┌───── Stage 2 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ FilterExec: MaxTemp@0 < 30, fetch=1000000 - │ PartitionIsolatorExec: tasks=3 partitions=3 - │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MaxTemp, RainToday], file_type=parquet, predicate=MaxTemp@1 < 30 AND DynamicFilter [ empty ], pruning_predicate=MaxTemp_null_count@1 != row_count@2 AND MaxTemp_min@0 < 30, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MaxTemp, RainToday], file_type=parquet, predicate=MaxTemp@1 < 30 AND DynamicFilter [ empty ], pruning_predicate=MaxTemp_null_count@1 != row_count@2 AND MaxTemp_min@0 < 30, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0] t1:[p1] │ LocalLimitExec: fetch=1000000 │ DistributedUnionExec: t0:[c0] t1:[c1] │ CoalescePartitionsExec: fetch=1000000 - │ [Stage 4] => NetworkCoalesceExec: output_partitions=3, input_tasks=3 + │ [Stage 4] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ ProjectionExec: expr=[Temp3pm@0 as Temp9am, RainToday@1 as RainToday] │ CoalescePartitionsExec: fetch=1000000 - │ [Stage 5] => NetworkCoalesceExec: output_partitions=3, input_tasks=3 + │ [Stage 5] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 └────────────────────────────────────────────────── - ┌───── Stage 4 ── Tasks: t0:[p0] t1:[p1] t2:[p2] + ┌───── Stage 4 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ FilterExec: Temp9am@0 > 15, fetch=1000000 - │ PartitionIsolatorExec: tasks=3 partitions=3 - │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[Temp9am, RainToday], file_type=parquet, predicate=Temp9am@17 > 15, pruning_predicate=Temp9am_null_count@1 != row_count@2 AND Temp9am_max@0 > 15, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[Temp9am, RainToday], file_type=parquet, predicate=Temp9am@17 > 15, pruning_predicate=Temp9am_null_count@1 != row_count@2 AND Temp9am_max@0 > 15, required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 5 ── Tasks: t0:[p0] t1:[p1] t2:[p2] + ┌───── Stage 5 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ FilterExec: Temp3pm@0 < 25, fetch=1000000 - │ PartitionIsolatorExec: tasks=3 partitions=3 - │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[Temp3pm, RainToday], file_type=parquet, predicate=Temp3pm@18 < 25, pruning_predicate=Temp3pm_null_count@1 != row_count@2 AND Temp3pm_min@0 < 25, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[Temp3pm, RainToday], file_type=parquet, predicate=Temp3pm@18 < 25, pruning_predicate=Temp3pm_null_count@1 != row_count@2 AND Temp3pm_min@0 < 25, required_guarantees=[] └────────────────────────────────────────────────── ", ); diff --git a/tests/join.rs b/tests/join.rs index 88a9ae71..a340cc7a 100644 --- a/tests/join.rs +++ b/tests/join.rs @@ -107,19 +107,17 @@ mod tests { // Ensure the distributed plan matches our target plan, registering // hive-style partitioning and avoiding data-shuffling repartitions. assert_snapshot!(&distributed_plan, - @" + @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [f_dkey@0 ASC NULLS LAST, timestamp@1 ASC NULLS LAST] - │ [Stage 1] => NetworkCoalesceExec: output_partitions=4, input_tasks=2 + │ [Stage 1] => NetworkCoalesceExec: output_partitions=8, input_tasks=2 └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0..p1] t1:[p2..p3] + ┌───── Stage 1 ── Tasks: t0:[p0..p3] t1:[p0..p3] │ ProjectionExec: expr=[f_dkey@5 as f_dkey, timestamp@3 as timestamp, value@4 as value, env@0 as env, service@1 as service, host@2 as host] │ HashJoinExec: mode=Partitioned, join_type=Inner, on=[(d_dkey@3, f_dkey@2)], projection=[env@0, service@1, host@2, timestamp@4, value@5, f_dkey@6] │ FilterExec: service@1 = log - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/join/parquet/dim/d_dkey=A/data0.parquet], [/testdata/join/parquet/dim/d_dkey=B/data0.parquet], [/testdata/join/parquet/dim/d_dkey=C/data0.parquet], [/testdata/join/parquet/dim/d_dkey=D/data0.parquet]]}, projection=[env, service, host, d_dkey], file_type=parquet, predicate=service@1 = log, pruning_predicate=service_null_count@2 != row_count@3 AND service_min@0 <= log AND log <= service_max@1, required_guarantees=[service in (log)] - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/join/parquet/fact/f_dkey=A/data0.parquet], [/testdata/join/parquet/fact/f_dkey=B/data0.parquet], [/testdata/join/parquet/fact/f_dkey=C/data0.parquet], [/testdata/join/parquet/fact/f_dkey=D/data0.parquet]]}, projection=[timestamp, value, f_dkey], output_ordering=[f_dkey@2 ASC NULLS LAST, timestamp@0 ASC NULLS LAST], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={4 groups: [[/testdata/join/parquet/dim/d_dkey=A/data0.parquet], [/testdata/join/parquet/dim/d_dkey=B/data0.parquet], [/testdata/join/parquet/dim/d_dkey=C/data0.parquet], [/testdata/join/parquet/dim/d_dkey=D/data0.parquet]]}, projection=[env, service, host, d_dkey], file_type=parquet, predicate=service@1 = log, pruning_predicate=service_null_count@2 != row_count@3 AND service_min@0 <= log AND log <= service_max@1, required_guarantees=[service in (log)] + │ DistributedLeafExec: DataSourceExec: file_groups={4 groups: [[/testdata/join/parquet/fact/f_dkey=A/data0.parquet], [/testdata/join/parquet/fact/f_dkey=B/data0.parquet], [/testdata/join/parquet/fact/f_dkey=C/data0.parquet], [/testdata/join/parquet/fact/f_dkey=D/data0.parquet]]}, projection=[timestamp, value, f_dkey], output_ordering=[f_dkey@2 ASC NULLS LAST, timestamp@0 ASC NULLS LAST], file_type=parquet └────────────────────────────────────────────────── "); @@ -188,18 +186,16 @@ mod tests { assert_snapshot!(&distributed_plan, @r#" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [f_dkey@0 ASC NULLS LAST, time_bin@1 ASC NULLS LAST] - │ [Stage 1] => NetworkCoalesceExec: output_partitions=4, input_tasks=2 + │ [Stage 1] => NetworkCoalesceExec: output_partitions=8, input_tasks=2 └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0..p1] t1:[p2..p3] + ┌───── Stage 1 ── Tasks: t0:[p0..p3] t1:[p0..p3] │ ProjectionExec: expr=[f_dkey@0 as f_dkey, date_bin(IntervalMonthDayNano("IntervalMonthDayNano { months: 0, days: 0, nanoseconds: 30000000000 }"),j.timestamp)@1 as time_bin, env@2 as env, max(j.value)@3 as max_bin_value] │ AggregateExec: mode=SinglePartitioned, gby=[f_dkey@0 as f_dkey, date_bin(IntervalMonthDayNano { months: 0, days: 0, nanoseconds: 30000000000 }, timestamp@2) as date_bin(IntervalMonthDayNano("IntervalMonthDayNano { months: 0, days: 0, nanoseconds: 30000000000 }"),j.timestamp), env@1 as env], aggr=[max(j.value)], ordering_mode=PartiallySorted([0, 1]) │ ProjectionExec: expr=[f_dkey@3 as f_dkey, env@0 as env, timestamp@1 as timestamp, value@2 as value] │ HashJoinExec: mode=Partitioned, join_type=Inner, on=[(d_dkey@1, f_dkey@2)], projection=[env@0, timestamp@2, value@3, f_dkey@4] │ FilterExec: service@1 = log, projection=[env@0, d_dkey@2] - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/join/parquet/dim/d_dkey=A/data0.parquet], [/testdata/join/parquet/dim/d_dkey=B/data0.parquet], [/testdata/join/parquet/dim/d_dkey=C/data0.parquet], [/testdata/join/parquet/dim/d_dkey=D/data0.parquet]]}, projection=[env, service, d_dkey], file_type=parquet, predicate=service@1 = log, pruning_predicate=service_null_count@2 != row_count@3 AND service_min@0 <= log AND log <= service_max@1, required_guarantees=[service in (log)] - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/join/parquet/fact/f_dkey=A/data0.parquet], [/testdata/join/parquet/fact/f_dkey=B/data0.parquet], [/testdata/join/parquet/fact/f_dkey=C/data0.parquet], [/testdata/join/parquet/fact/f_dkey=D/data0.parquet]]}, projection=[timestamp, value, f_dkey], output_ordering=[f_dkey@2 ASC NULLS LAST, timestamp@0 ASC NULLS LAST], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={4 groups: [[/testdata/join/parquet/dim/d_dkey=A/data0.parquet], [/testdata/join/parquet/dim/d_dkey=B/data0.parquet], [/testdata/join/parquet/dim/d_dkey=C/data0.parquet], [/testdata/join/parquet/dim/d_dkey=D/data0.parquet]]}, projection=[env, service, d_dkey], file_type=parquet, predicate=service@1 = log, pruning_predicate=service_null_count@2 != row_count@3 AND service_min@0 <= log AND log <= service_max@1, required_guarantees=[service in (log)] + │ DistributedLeafExec: DataSourceExec: file_groups={4 groups: [[/testdata/join/parquet/fact/f_dkey=A/data0.parquet], [/testdata/join/parquet/fact/f_dkey=B/data0.parquet], [/testdata/join/parquet/fact/f_dkey=C/data0.parquet], [/testdata/join/parquet/fact/f_dkey=D/data0.parquet]]}, projection=[timestamp, value, f_dkey], output_ordering=[f_dkey@2 ASC NULLS LAST, timestamp@0 ASC NULLS LAST], file_type=parquet └────────────────────────────────────────────────── "#); @@ -270,17 +266,15 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=4, input_tasks=2 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p3] t1:[p0..p3] - │ RepartitionExec: partitioning=Hash([env@0, time_bin@1], 4), input_partitions=2 + │ RepartitionExec: partitioning=Hash([env@0, time_bin@1], 4), input_partitions=4 │ AggregateExec: mode=Partial, gby=[env@1 as env, time_bin@0 as time_bin], aggr=[avg(a.max_bin_value)] │ ProjectionExec: expr=[date_bin(IntervalMonthDayNano("IntervalMonthDayNano { months: 0, days: 0, nanoseconds: 30000000000 }"),j.timestamp)@1 as time_bin, env@2 as env, max(j.value)@3 as max_bin_value] │ AggregateExec: mode=SinglePartitioned, gby=[f_dkey@0 as f_dkey, date_bin(IntervalMonthDayNano { months: 0, days: 0, nanoseconds: 30000000000 }, timestamp@2) as date_bin(IntervalMonthDayNano("IntervalMonthDayNano { months: 0, days: 0, nanoseconds: 30000000000 }"),j.timestamp), env@1 as env], aggr=[max(j.value)], ordering_mode=PartiallySorted([0, 1]) │ ProjectionExec: expr=[f_dkey@3 as f_dkey, env@0 as env, timestamp@1 as timestamp, value@2 as value] │ HashJoinExec: mode=Partitioned, join_type=Inner, on=[(d_dkey@1, f_dkey@2)], projection=[env@0, timestamp@2, value@3, f_dkey@4] │ FilterExec: service@1 = log, projection=[env@0, d_dkey@2] - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/join/parquet/dim/d_dkey=A/data0.parquet], [/testdata/join/parquet/dim/d_dkey=B/data0.parquet], [/testdata/join/parquet/dim/d_dkey=C/data0.parquet], [/testdata/join/parquet/dim/d_dkey=D/data0.parquet]]}, projection=[env, service, d_dkey], file_type=parquet, predicate=service@1 = log, pruning_predicate=service_null_count@2 != row_count@3 AND service_min@0 <= log AND log <= service_max@1, required_guarantees=[service in (log)] - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/join/parquet/fact/f_dkey=A/data0.parquet], [/testdata/join/parquet/fact/f_dkey=B/data0.parquet], [/testdata/join/parquet/fact/f_dkey=C/data0.parquet], [/testdata/join/parquet/fact/f_dkey=D/data0.parquet]]}, projection=[timestamp, value, f_dkey], output_ordering=[f_dkey@2 ASC NULLS LAST, timestamp@0 ASC NULLS LAST], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={4 groups: [[/testdata/join/parquet/dim/d_dkey=A/data0.parquet], [/testdata/join/parquet/dim/d_dkey=B/data0.parquet], [/testdata/join/parquet/dim/d_dkey=C/data0.parquet], [/testdata/join/parquet/dim/d_dkey=D/data0.parquet]]}, projection=[env, service, d_dkey], file_type=parquet, predicate=service@1 = log, pruning_predicate=service_null_count@2 != row_count@3 AND service_min@0 <= log AND log <= service_max@1, required_guarantees=[service in (log)] + │ DistributedLeafExec: DataSourceExec: file_groups={4 groups: [[/testdata/join/parquet/fact/f_dkey=A/data0.parquet], [/testdata/join/parquet/fact/f_dkey=B/data0.parquet], [/testdata/join/parquet/fact/f_dkey=C/data0.parquet], [/testdata/join/parquet/fact/f_dkey=D/data0.parquet]]}, projection=[timestamp, value, f_dkey], output_ordering=[f_dkey@2 ASC NULLS LAST, timestamp@0 ASC NULLS LAST], file_type=parquet └────────────────────────────────────────────────── "#); diff --git a/tests/metrics_collection.rs b/tests/metrics_collection.rs index 1bc283c3..70bb98f7 100644 --- a/tests/metrics_collection.rs +++ b/tests/metrics_collection.rs @@ -1,6 +1,5 @@ #[cfg(all(feature = "integration", test))] mod tests { - use datafusion::catalog::memory::DataSourceExec; use datafusion::common::Result; use datafusion::common::assert_not_contains; use datafusion::common::tree_node::{Transformed, TreeNode, TreeNodeRecursion}; @@ -15,8 +14,8 @@ mod tests { TestWorkUnitFeedTaskEstimator, }; use datafusion_distributed::{ - DefaultSessionBuilder, DistributedExt, DistributedMetricsFormat, NetworkCoalesceExec, - NetworkShuffleExec, WorkerQueryContext, display_plan_ascii, + DefaultSessionBuilder, DistributedExt, DistributedLeafExec, DistributedMetricsFormat, + NetworkCoalesceExec, NetworkShuffleExec, WorkerQueryContext, display_plan_ascii, rewrite_distributed_plan_with_metrics, }; use futures::TryStreamExt; @@ -40,7 +39,7 @@ mod tests { println!("{}", display_plan_ascii(s_physical.as_ref(), true)); println!("{}", display_plan_ascii(d_physical.as_ref(), true)); - assert_metrics_equal::( + assert_metrics_equal::( ["output_rows", "output_bytes"], &s_physical, &d_physical, @@ -89,7 +88,7 @@ mod tests { println!("{}", display_plan_ascii(d_physical.as_ref(), true)); for data_source_index in 0..2 { - assert_metrics_equal::( + assert_metrics_equal::( ["output_rows", "output_bytes"], &s_physical, &d_physical, @@ -128,7 +127,7 @@ mod tests { println!("{}", display_plan_ascii(d_physical.as_ref(), true)); for data_source_index in 0..5 { - assert_metrics_equal::( + assert_metrics_equal::( ["output_rows", "output_bytes"], &s_physical, &d_physical, diff --git a/tests/stateful_execution_plan.rs b/tests/stateful_execution_plan.rs index 2a051be3..5b9febaa 100644 --- a/tests/stateful_execution_plan.rs +++ b/tests/stateful_execution_plan.rs @@ -64,17 +64,16 @@ mod tests { let plan_str = display_plan_ascii(plan.as_ref(), false); assert_snapshot!(plan_str, - @" + @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [MinTemp@0 DESC] - │ [Stage 1] => NetworkCoalesceExec: output_partitions=3, input_tasks=3 + │ [Stage 1] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0] t1:[p1] t2:[p2] + ┌───── Stage 1 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ SortExec: expr=[MinTemp@0 DESC], preserve_partitioning=[true] │ FilterExec: MinTemp@0 > 20 - │ PartitionIsolatorExec: tasks=3 partitions=3 - │ StatefulPassThroughExec - │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MinTemp, RainToday], file_type=parquet, predicate=MinTemp@0 > 20, pruning_predicate=MinTemp_null_count@1 != row_count@2 AND MinTemp_max@0 > 20, required_guarantees=[] + │ StatefulPassThroughExec + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[MinTemp, RainToday], file_type=parquet, predicate=MinTemp@0 > 20, pruning_predicate=MinTemp_null_count@1 != row_count@2 AND MinTemp_max@0 > 20, required_guarantees=[] └────────────────────────────────────────────────── ", ); diff --git a/tests/task_estimator_test.rs b/tests/task_estimator_test.rs index e3b031af..649a56c3 100644 --- a/tests/task_estimator_test.rs +++ b/tests/task_estimator_test.rs @@ -24,15 +24,14 @@ mod tests { .await?; assert_snapshot!(plan + &results, - @" + @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [task_index@1 ASC NULLS LAST] │ [Stage 1] => NetworkCoalesceExec: output_partitions=5, input_tasks=5 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0] t1:[p1] t2:[p2] t3:[p3] t4:[p4] │ SortExec: expr=[task_index@1 ASC NULLS LAST], preserve_partitioning=[true] - │ PartitionIsolatorExec: tasks=5 partitions=5 - │ URLEmitterExec: tasks=5 partitions=5 tag=logs + │ DistributedLeafExec: URLEmitterExec: tasks=5 partitions=1 tag=logs └────────────────────────────────────────────────── +------------+------------+------+--------------+ | task_count | task_index | tag | worker_url | @@ -60,15 +59,14 @@ mod tests { .await?; assert_snapshot!(plan + &results, - @" + @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [task_index@1 ASC NULLS LAST] │ [Stage 1] => NetworkCoalesceExec: output_partitions=10, input_tasks=5 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] t3:[p6..p7] t4:[p8..p9] │ SortExec: expr=[task_index@1 ASC NULLS LAST], preserve_partitioning=[true] - │ PartitionIsolatorExec: tasks=5 partitions=8 - │ URLEmitterExec: tasks=5 partitions=8 tag=logs + │ DistributedLeafExec: URLEmitterExec: tasks=5 partitions=2 tag=logs └────────────────────────────────────────────────── +------------+------------+------+--------------+ | task_count | task_index | tag | worker_url | @@ -99,15 +97,14 @@ mod tests { .await?; assert_snapshot!(plan + &results, - @" + @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [task_index@1 ASC NULLS LAST] │ [Stage 1] => NetworkCoalesceExec: output_partitions=5, input_tasks=5 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0] t1:[p1] t2:[p2] t3:[p3] t4:[p4] │ SortExec: expr=[task_index@1 ASC NULLS LAST], preserve_partitioning=[true] - │ PartitionIsolatorExec: tasks=5 partitions=3 - │ URLEmitterExec: tasks=5 partitions=3 tag=logs + │ DistributedLeafExec: URLEmitterExec: tasks=5 partitions=1 tag=logs └────────────────────────────────────────────────── +------------+------------+------+--------------+ | task_count | task_index | tag | worker_url | @@ -150,10 +147,8 @@ mod tests { │ RepartitionExec: partitioning=Hash([task_count@0, task_index@1, tag@2, worker_url@3], 12), input_partitions=3 │ AggregateExec: mode=Partial, gby=[task_count@0 as task_count, task_index@1 as task_index, tag@2 as tag, worker_url@3 as worker_url], aggr=[] │ DistributedUnionExec: t0:[c0(0/3)] t1:[c0(1/3)] t2:[c0(2/3)] t3:[c1(0/2)] t4:[c1(1/2)] - │ PartitionIsolatorExec: tasks=3 partitions=5 - │ URLEmitterExec: tasks=5 partitions=5 tag=left - │ PartitionIsolatorExec: tasks=2 partitions=5 - │ URLEmitterExec: tasks=5 partitions=5 tag=right + │ DistributedLeafExec: URLEmitterExec: tasks=5 partitions=2 tag=left + │ DistributedLeafExec: URLEmitterExec: tasks=5 partitions=3 tag=right └────────────────────────────────────────────────── +------------+------------+-------+--------------+ | task_count | task_index | tag | worker_url | @@ -198,10 +193,8 @@ mod tests { │ RepartitionExec: partitioning=Hash([task_count@0, task_index@1, tag@2, worker_url@3], 12), input_partitions=4 │ AggregateExec: mode=Partial, gby=[task_count@0 as task_count, task_index@1 as task_index, tag@2 as tag, worker_url@3 as worker_url], aggr=[] │ DistributedUnionExec: t0:[c0(0/4)] t1:[c0(1/4)] t2:[c0(2/4)] t3:[c0(3/4)] t4:[c1] - │ PartitionIsolatorExec: tasks=4 partitions=3 - │ URLEmitterExec: tasks=8 partitions=3 tag=left - │ PartitionIsolatorExec: tasks=1 partitions=4 - │ URLEmitterExec: tasks=2 partitions=4 tag=right + │ DistributedLeafExec: URLEmitterExec: tasks=8 partitions=1 tag=left + │ DistributedLeafExec: URLEmitterExec: tasks=2 partitions=4 tag=right └────────────────────────────────────────────────── +------------+------------+-------+--------------+ | task_count | task_index | tag | worker_url | @@ -237,7 +230,7 @@ mod tests { .await?; assert_snapshot!(plan + &results, - @" + @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ ProjectionExec: expr=[task_count@0 as task_count, left_index@1 as left_index, left_tag@2 as left_tag, worker_left@3 as worker_left, right_index@4 as right_index, right_tag@5 as right_tag, worker_right@6 as worker_right] │ SortPreservingMergeExec: [task_index@7 ASC NULLS LAST] @@ -252,13 +245,11 @@ mod tests { └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p14] t1:[p0..p14] t2:[p0..p14] t3:[p0..p14] t4:[p0..p14] │ RepartitionExec: partitioning=Hash([task_index@1], 15), input_partitions=1 - │ PartitionIsolatorExec: tasks=5 partitions=5 - │ URLEmitterExec: tasks=5 partitions=5 tag=left + │ DistributedLeafExec: URLEmitterExec: tasks=5 partitions=1 tag=left └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p14] t1:[p0..p14] t2:[p0..p14] t3:[p0..p14] t4:[p0..p14] │ RepartitionExec: partitioning=Hash([task_index@0], 15), input_partitions=1 - │ PartitionIsolatorExec: tasks=5 partitions=5 - │ URLEmitterExec: tasks=5 partitions=5 tag=right + │ DistributedLeafExec: URLEmitterExec: tasks=5 partitions=1 tag=right └────────────────────────────────────────────────── +------------+------------+----------+--------------+-------------+-----------+--------------+ | task_count | left_index | left_tag | worker_left | right_index | right_tag | worker_right | @@ -295,7 +286,7 @@ mod tests { .await?; assert_snapshot!(plan + &results, - @" + @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ ProjectionExec: expr=[task_count@0 as task_count, left_index@1 as left_index, left_tag@2 as left_tag, worker_left@3 as worker_left, right_index@4 as right_index, right_tag@5 as right_tag, worker_right@6 as worker_right] │ SortPreservingMergeExec: [task_index@7 ASC NULLS LAST] @@ -310,13 +301,11 @@ mod tests { └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p14] t1:[p0..p14] t2:[p0..p14] t3:[p0..p14] t4:[p0..p14] │ RepartitionExec: partitioning=Hash([task_index@1], 15), input_partitions=3 - │ PartitionIsolatorExec: tasks=5 partitions=12 - │ URLEmitterExec: tasks=9 partitions=12 tag=left + │ DistributedLeafExec: URLEmitterExec: tasks=9 partitions=3 tag=left └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p14] t1:[p0..p14] t2:[p0..p14] t3:[p0..p14] t4:[p0..p14] │ RepartitionExec: partitioning=Hash([task_index@0], 15), input_partitions=2 - │ PartitionIsolatorExec: tasks=5 partitions=7 - │ URLEmitterExec: tasks=10 partitions=7 tag=right + │ DistributedLeafExec: URLEmitterExec: tasks=10 partitions=2 tag=right └────────────────────────────────────────────────── +------------+------------+----------+--------------+-------------+-----------+--------------+ | task_count | left_index | left_tag | worker_left | right_index | right_tag | worker_right | diff --git a/tests/tpcds_plans_test.rs b/tests/tpcds_plans_test.rs index 21352ecb..72c296a4 100644 --- a/tests/tpcds_plans_test.rs +++ b/tests/tpcds_plans_test.rs @@ -20,7 +20,7 @@ mod tests { #[tokio::test] async fn test_tpcds_1() -> Result<()> { let display = test_tpcds_query("q1").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [c_customer_id@0 ASC NULLS LAST], fetch=100 │ [Stage 9] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -41,8 +41,7 @@ mod tests { │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=6, input_tasks=2 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_customer_id@1 as c_customer_id, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_customer_id], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_customer_id], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=3, consumer_tasks=2, output_partitions=6 @@ -58,25 +57,22 @@ mod tests { │ ProjectionExec: expr=[s_store_sk@0 as s_store_sk, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)] │ FilterExec: s_state@1 = TN, projection=[s_store_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_state], file_type=parquet, predicate=s_state@24 = TN, pruning_predicate=s_state_null_count@2 != row_count@3 AND s_state_min@0 <= TN AND TN <= s_state_max@1, required_guarantees=[s_state in (TN)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_state], file_type=parquet, predicate=s_state@24 = TN, pruning_predicate=s_state_null_count@2 != row_count@3 AND s_state_min@0 <= TN AND TN <= s_state_max@1, required_guarantees=[s_state in (TN)] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([sr_customer_sk@0, sr_store_sk@1], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([sr_customer_sk@0, sr_store_sk@1], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[sr_customer_sk@0 as sr_customer_sk, sr_store_sk@1 as sr_store_sk], aggr=[sum(store_returns.sr_return_amt)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, sr_returned_date_sk@0)], projection=[sr_customer_sk@3, sr_store_sk@4, sr_return_amt@5] │ CoalescePartitionsExec │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], ...]}, projection=[sr_returned_date_sk, sr_customer_sk, sr_store_sk, sr_return_amt], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-3.parquet:..]]}, projection=[sr_returned_date_sk, sr_customer_sk, sr_store_sk, sr_return_amt], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2000, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2000, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1, required_guarantees=[d_year in (2000)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2000, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1, required_guarantees=[d_year in (2000)] └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p5] t1:[p0..p5] │ RepartitionExec: partitioning=Hash([ctr_store_sk@0], 6), input_partitions=3 @@ -86,21 +82,19 @@ mod tests { │ [Stage 7] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([sr_customer_sk@0, sr_store_sk@1], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([sr_customer_sk@0, sr_store_sk@1], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[sr_customer_sk@0 as sr_customer_sk, sr_store_sk@1 as sr_store_sk], aggr=[sum(store_returns.sr_return_amt)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, sr_returned_date_sk@0)], projection=[sr_customer_sk@3, sr_store_sk@4, sr_return_amt@5] │ CoalescePartitionsExec │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], ...]}, projection=[sr_returned_date_sk, sr_customer_sk, sr_store_sk, sr_return_amt], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-3.parquet:..]]}, projection=[sr_returned_date_sk, sr_customer_sk, sr_store_sk, sr_return_amt], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2000, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2000, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1, required_guarantees=[d_year in (2000)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2000, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1, required_guarantees=[d_year in (2000)] └────────────────────────────────────────────────── "); Ok(()) @@ -142,8 +136,7 @@ mod tests { │ BroadcastExec: input_partitions=3, consumer_tasks=2, output_partitions=6 │ FilterExec: d_year@1 = 2001, projection=[d_week_seq@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_week_seq, d_year], file_type=parquet, predicate=d_year@6 = 2001, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_week_seq, d_year], file_type=parquet, predicate=d_year@6 = 2001, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (2001)] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] t3:[p0..p5] │ RepartitionExec: partitioning=Hash([d_week_seq@0], 6), input_partitions=3 @@ -153,24 +146,20 @@ mod tests { │ CoalescePartitionsExec │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=12, input_tasks=2 │ DistributedUnionExec: t0:[c0(0/2)] t1:[c0(1/2)] t2:[c1(0/2)] t3:[c1(1/2)] - │ PartitionIsolatorExec: tasks=2 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk@0 as sold_date_sk, ws_ext_sales_price@23 as sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] - │ PartitionIsolatorExec: tasks=2 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk@0 as sold_date_sk, cs_ext_sales_price@23 as sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk@0 as sold_date_sk, ws_ext_sales_price@23 as sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk@0 as sold_date_sk, cs_ext_sales_price@23 as sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p11] t1:[p12..p23] │ BroadcastExec: input_partitions=3, consumer_tasks=4, output_partitions=12 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_week_seq@1 as d_week_seq, d_day_name@2 as d_day_name, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_week_seq, d_day_name], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_week_seq, d_day_name], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=3, consumer_tasks=2, output_partitions=6 │ FilterExec: d_year@1 = 2002, projection=[d_week_seq@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_week_seq, d_year], file_type=parquet, predicate=d_year@6 = 2002, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1, required_guarantees=[d_year in (2002)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_week_seq, d_year], file_type=parquet, predicate=d_year@6 = 2002, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1, required_guarantees=[d_year in (2002)] └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] t3:[p0..p5] │ RepartitionExec: partitioning=Hash([d_week_seq@0], 6), input_partitions=3 @@ -180,17 +169,14 @@ mod tests { │ CoalescePartitionsExec │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=12, input_tasks=2 │ DistributedUnionExec: t0:[c0(0/2)] t1:[c0(1/2)] t2:[c1(0/2)] t3:[c1(1/2)] - │ PartitionIsolatorExec: tasks=2 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk@0 as sold_date_sk, ws_ext_sales_price@23 as sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] - │ PartitionIsolatorExec: tasks=2 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk@0 as sold_date_sk, cs_ext_sales_price@23 as sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk@0 as sold_date_sk, ws_ext_sales_price@23 as sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk@0 as sold_date_sk, cs_ext_sales_price@23 as sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p11] t1:[p12..p23] │ BroadcastExec: input_partitions=3, consumer_tasks=4, output_partitions=12 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_week_seq@1 as d_week_seq, d_day_name@2 as d_day_name, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_week_seq, d_day_name], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_week_seq, d_day_name], file_type=parquet └────────────────────────────────────────────────── "#); Ok(()) @@ -198,7 +184,7 @@ mod tests { #[tokio::test] async fn test_tpcds_3() -> Result<()> { let display = test_tpcds_query("q3").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [d_year@0 ASC NULLS LAST, sum_agg@3 DESC, brand_id@1 ASC NULLS LAST], fetch=100 │ [Stage 4] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -214,27 +200,24 @@ mod tests { │ AggregateExec: mode=Partial, gby=[d_year@0 as d_year, i_brand@3 as i_brand, i_brand_id@2 as i_brand_id], aggr=[sum(store_sales.ss_ext_sales_price)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ss_item_sk@1, i_item_sk@0)], projection=[d_year@0, ss_ext_sales_price@2, i_brand_id@4, i_brand@5] │ CoalescePartitionsExec - │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ FilterExec: i_manufact_id@3 = 128, projection=[i_item_sk@0, i_brand_id@1, i_brand@2] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_brand, i_manufact_id], file_type=parquet, predicate=i_manufact_id@13 = 128 AND DynamicFilter [ empty ], pruning_predicate=i_manufact_id_null_count@2 != row_count@3 AND i_manufact_id_min@0 <= 128 AND 128 <= i_manufact_id_max@1, required_guarantees=[i_manufact_id in (128)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_brand, i_manufact_id], file_type=parquet, predicate=i_manufact_id@13 = 128 AND DynamicFilter [ empty ], pruning_predicate=i_manufact_id_null_count@2 != row_count@3 AND i_manufact_id_min@0 <= 128 AND 128 <= i_manufact_id_max@1, required_guarantees=[i_manufact_id in (128)] └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(dt.d_date_sk AS Float64)@2, ss_sold_date_sk@0)], projection=[d_year@1, ss_item_sk@4, ss_ext_sales_price@5] │ CoalescePartitionsExec │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, CAST(d_date_sk@0 AS Float64) as CAST(dt.d_date_sk AS Float64)] │ FilterExec: d_moy@2 = 11, projection=[d_date_sk@0, d_year@1] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_moy@8 = 11, pruning_predicate=d_moy_null_count@2 != row_count@3 AND d_moy_min@0 <= 11 AND 11 <= d_moy_max@1, required_guarantees=[d_moy in (11)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_moy@8 = 11, pruning_predicate=d_moy_null_count@2 != row_count@3 AND d_moy_min@0 <= 11 AND 11 <= d_moy_max@1, required_guarantees=[d_moy in (11)] └────────────────────────────────────────────────── "); Ok(()) @@ -242,7 +225,7 @@ mod tests { #[tokio::test] async fn test_tpcds_4() -> Result<()> { let display = test_tpcds_query("q4").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [customer_id@0 ASC, customer_first_name@1 ASC, customer_last_name@2 ASC, customer_preferred_cust_flag@3 ASC], fetch=100 │ [Stage 24] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -278,7 +261,7 @@ mod tests { │ [Stage 3] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([c_customer_id@0, c_first_name@1, c_last_name@2, c_preferred_cust_flag@3, c_birth_country@4, c_login@5, c_email_address@6, d_year@7], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([c_customer_id@0, c_first_name@1, c_last_name@2, c_preferred_cust_flag@3, c_birth_country@4, c_login@5, c_email_address@6, d_year@7], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[c_customer_id@0 as c_customer_id, c_first_name@1 as c_first_name, c_last_name@2 as c_last_name, c_preferred_cust_flag@3 as c_preferred_cust_flag, c_birth_country@4 as c_birth_country, c_login@5 as c_login, c_email_address@6 as c_email_address, d_year@11 as d_year], aggr=[sum(web_sales.ws_ext_list_price - web_sales.ws_ext_wholesale_cost - web_sales.ws_ext_discount_amt + web_sales.ws_ext_sales_price / Int64(2))], ordering_mode=PartiallySorted([7]) │ ProjectionExec: expr=[c_customer_id@1 as c_customer_id, c_first_name@2 as c_first_name, c_last_name@3 as c_last_name, c_preferred_cust_flag@4 as c_preferred_cust_flag, c_birth_country@5 as c_birth_country, c_login@6 as c_login, c_email_address@7 as c_email_address, ws_ext_discount_amt@8 as ws_ext_discount_amt, ws_ext_sales_price@9 as ws_ext_sales_price, ws_ext_wholesale_cost@10 as ws_ext_wholesale_cost, ws_ext_list_price@11 as ws_ext_list_price, d_year@0 as d_year] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@2, ws_sold_date_sk@7)], projection=[d_year@1, c_customer_id@3, c_first_name@4, c_last_name@5, c_preferred_cust_flag@6, c_birth_country@7, c_login@8, c_email_address@9, ws_ext_discount_amt@11, ws_ext_sales_price@12, ws_ext_wholesale_cost@13, ws_ext_list_price@14] @@ -287,23 +270,20 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(customer.c_customer_sk AS Float64)@8, ws_bill_customer_sk@1)], projection=[c_customer_id@1, c_first_name@2, c_last_name@3, c_preferred_cust_flag@4, c_birth_country@5, c_login@6, c_email_address@7, ws_sold_date_sk@9, ws_ext_discount_amt@11, ws_ext_sales_price@12, ws_ext_wholesale_cost@13, ws_ext_list_price@14] │ CoalescePartitionsExec │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_bill_customer_sk, ws_ext_discount_amt, ws_ext_sales_price, ws_ext_wholesale_cost, ws_ext_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_bill_customer_sk, ws_ext_discount_amt, ws_ext_sales_price, ws_ext_wholesale_cost, ws_ext_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2001 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2001, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2001, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (2001)] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_customer_id@1 as c_customer_id, c_first_name@2 as c_first_name, c_last_name@3 as c_last_name, c_preferred_cust_flag@4 as c_preferred_cust_flag, c_birth_country@5 as c_birth_country, c_login@6 as c_login, c_email_address@7 as c_email_address, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_customer_id, c_first_name, c_last_name, c_preferred_cust_flag, c_birth_country, c_login, c_email_address], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_customer_id, c_first_name, c_last_name, c_preferred_cust_flag, c_birth_country, c_login, c_email_address], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 16 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=3, consumer_tasks=2, output_partitions=6 @@ -327,7 +307,7 @@ mod tests { │ [Stage 7] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([c_customer_id@0, c_first_name@1, c_last_name@2, c_preferred_cust_flag@3, c_birth_country@4, c_login@5, c_email_address@6, d_year@7], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([c_customer_id@0, c_first_name@1, c_last_name@2, c_preferred_cust_flag@3, c_birth_country@4, c_login@5, c_email_address@6, d_year@7], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[c_customer_id@0 as c_customer_id, c_first_name@1 as c_first_name, c_last_name@2 as c_last_name, c_preferred_cust_flag@3 as c_preferred_cust_flag, c_birth_country@4 as c_birth_country, c_login@5 as c_login, c_email_address@6 as c_email_address, d_year@11 as d_year], aggr=[sum(catalog_sales.cs_ext_list_price - catalog_sales.cs_ext_wholesale_cost - catalog_sales.cs_ext_discount_amt + catalog_sales.cs_ext_sales_price / Int64(2))], ordering_mode=PartiallySorted([7]) │ ProjectionExec: expr=[c_customer_id@1 as c_customer_id, c_first_name@2 as c_first_name, c_last_name@3 as c_last_name, c_preferred_cust_flag@4 as c_preferred_cust_flag, c_birth_country@5 as c_birth_country, c_login@6 as c_login, c_email_address@7 as c_email_address, cs_ext_discount_amt@8 as cs_ext_discount_amt, cs_ext_sales_price@9 as cs_ext_sales_price, cs_ext_wholesale_cost@10 as cs_ext_wholesale_cost, cs_ext_list_price@11 as cs_ext_list_price, d_year@0 as d_year] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@2, cs_sold_date_sk@7)], projection=[d_year@1, c_customer_id@3, c_first_name@4, c_last_name@5, c_preferred_cust_flag@6, c_birth_country@7, c_login@8, c_email_address@9, cs_ext_discount_amt@11, cs_ext_sales_price@12, cs_ext_wholesale_cost@13, cs_ext_list_price@14] @@ -336,23 +316,20 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(customer.c_customer_sk AS Float64)@8, cs_bill_customer_sk@1)], projection=[c_customer_id@1, c_first_name@2, c_last_name@3, c_preferred_cust_flag@4, c_birth_country@5, c_login@6, c_email_address@7, cs_sold_date_sk@9, cs_ext_discount_amt@11, cs_ext_sales_price@12, cs_ext_wholesale_cost@13, cs_ext_list_price@14] │ CoalescePartitionsExec │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_bill_customer_sk, cs_ext_discount_amt, cs_ext_sales_price, cs_ext_wholesale_cost, cs_ext_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_bill_customer_sk, cs_ext_discount_amt, cs_ext_sales_price, cs_ext_wholesale_cost, cs_ext_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2001 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2001, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2001, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (2001)] └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_customer_id@1 as c_customer_id, c_first_name@2 as c_first_name, c_last_name@3 as c_last_name, c_preferred_cust_flag@4 as c_preferred_cust_flag, c_birth_country@5 as c_birth_country, c_login@6 as c_login, c_email_address@7 as c_email_address, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_customer_id, c_first_name, c_last_name, c_preferred_cust_flag, c_birth_country, c_login, c_email_address], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_customer_id, c_first_name, c_last_name, c_preferred_cust_flag, c_birth_country, c_login, c_email_address], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 12 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=3, consumer_tasks=2, output_partitions=6 @@ -363,7 +340,7 @@ mod tests { │ [Stage 11] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 11 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([c_customer_id@0, c_first_name@1, c_last_name@2, c_preferred_cust_flag@3, c_birth_country@4, c_login@5, c_email_address@6, d_year@7], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([c_customer_id@0, c_first_name@1, c_last_name@2, c_preferred_cust_flag@3, c_birth_country@4, c_login@5, c_email_address@6, d_year@7], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[c_customer_id@0 as c_customer_id, c_first_name@1 as c_first_name, c_last_name@2 as c_last_name, c_preferred_cust_flag@3 as c_preferred_cust_flag, c_birth_country@4 as c_birth_country, c_login@5 as c_login, c_email_address@6 as c_email_address, d_year@11 as d_year], aggr=[sum(store_sales.ss_ext_list_price - store_sales.ss_ext_wholesale_cost - store_sales.ss_ext_discount_amt + store_sales.ss_ext_sales_price / Int64(2))], ordering_mode=PartiallySorted([7]) │ ProjectionExec: expr=[c_customer_id@1 as c_customer_id, c_first_name@2 as c_first_name, c_last_name@3 as c_last_name, c_preferred_cust_flag@4 as c_preferred_cust_flag, c_birth_country@5 as c_birth_country, c_login@6 as c_login, c_email_address@7 as c_email_address, ss_ext_discount_amt@8 as ss_ext_discount_amt, ss_ext_sales_price@9 as ss_ext_sales_price, ss_ext_wholesale_cost@10 as ss_ext_wholesale_cost, ss_ext_list_price@11 as ss_ext_list_price, d_year@0 as d_year] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@2, ss_sold_date_sk@7)], projection=[d_year@1, c_customer_id@3, c_first_name@4, c_last_name@5, c_preferred_cust_flag@6, c_birth_country@7, c_login@8, c_email_address@9, ss_ext_discount_amt@11, ss_ext_sales_price@12, ss_ext_wholesale_cost@13, ss_ext_list_price@14] @@ -372,26 +349,23 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(customer.c_customer_sk AS Float64)@8, ss_customer_sk@1)], projection=[c_customer_id@1, c_first_name@2, c_last_name@3, c_preferred_cust_flag@4, c_birth_country@5, c_login@6, c_email_address@7, ss_sold_date_sk@9, ss_ext_discount_amt@11, ss_ext_sales_price@12, ss_ext_wholesale_cost@13, ss_ext_list_price@14] │ CoalescePartitionsExec │ [Stage 10] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_customer_sk, ss_ext_discount_amt, ss_ext_sales_price, ss_ext_wholesale_cost, ss_ext_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_customer_sk, ss_ext_discount_amt, ss_ext_sales_price, ss_ext_wholesale_cost, ss_ext_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2001 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2001, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2001, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (2001)] └────────────────────────────────────────────────── ┌───── Stage 10 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_customer_id@1 as c_customer_id, c_first_name@2 as c_first_name, c_last_name@3 as c_last_name, c_preferred_cust_flag@4 as c_preferred_cust_flag, c_birth_country@5 as c_birth_country, c_login@6 as c_login, c_email_address@7 as c_email_address, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_customer_id, c_first_name, c_last_name, c_preferred_cust_flag, c_birth_country, c_login, c_email_address], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_customer_id, c_first_name, c_last_name, c_preferred_cust_flag, c_birth_country, c_login, c_email_address], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 15 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([c_customer_id@0, c_first_name@1, c_last_name@2, c_preferred_cust_flag@3, c_birth_country@4, c_login@5, c_email_address@6, d_year@7], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([c_customer_id@0, c_first_name@1, c_last_name@2, c_preferred_cust_flag@3, c_birth_country@4, c_login@5, c_email_address@6, d_year@7], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[c_customer_id@0 as c_customer_id, c_first_name@1 as c_first_name, c_last_name@2 as c_last_name, c_preferred_cust_flag@3 as c_preferred_cust_flag, c_birth_country@4 as c_birth_country, c_login@5 as c_login, c_email_address@6 as c_email_address, d_year@11 as d_year], aggr=[sum(store_sales.ss_ext_list_price - store_sales.ss_ext_wholesale_cost - store_sales.ss_ext_discount_amt + store_sales.ss_ext_sales_price / Int64(2))], ordering_mode=PartiallySorted([7]) │ ProjectionExec: expr=[c_customer_id@1 as c_customer_id, c_first_name@2 as c_first_name, c_last_name@3 as c_last_name, c_preferred_cust_flag@4 as c_preferred_cust_flag, c_birth_country@5 as c_birth_country, c_login@6 as c_login, c_email_address@7 as c_email_address, ss_ext_discount_amt@8 as ss_ext_discount_amt, ss_ext_sales_price@9 as ss_ext_sales_price, ss_ext_wholesale_cost@10 as ss_ext_wholesale_cost, ss_ext_list_price@11 as ss_ext_list_price, d_year@0 as d_year] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@2, ss_sold_date_sk@7)], projection=[d_year@1, c_customer_id@3, c_first_name@4, c_last_name@5, c_preferred_cust_flag@6, c_birth_country@7, c_login@8, c_email_address@9, ss_ext_discount_amt@11, ss_ext_sales_price@12, ss_ext_wholesale_cost@13, ss_ext_list_price@14] @@ -400,26 +374,23 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(customer.c_customer_sk AS Float64)@8, ss_customer_sk@1)], projection=[c_customer_id@1, c_first_name@2, c_last_name@3, c_preferred_cust_flag@4, c_birth_country@5, c_login@6, c_email_address@7, ss_sold_date_sk@9, ss_ext_discount_amt@11, ss_ext_sales_price@12, ss_ext_wholesale_cost@13, ss_ext_list_price@14] │ CoalescePartitionsExec │ [Stage 14] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_customer_sk, ss_ext_discount_amt, ss_ext_sales_price, ss_ext_wholesale_cost, ss_ext_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_customer_sk, ss_ext_discount_amt, ss_ext_sales_price, ss_ext_wholesale_cost, ss_ext_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 13 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2002 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2002, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1, required_guarantees=[d_year in (2002)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2002, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1, required_guarantees=[d_year in (2002)] └────────────────────────────────────────────────── ┌───── Stage 14 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_customer_id@1 as c_customer_id, c_first_name@2 as c_first_name, c_last_name@3 as c_last_name, c_preferred_cust_flag@4 as c_preferred_cust_flag, c_birth_country@5 as c_birth_country, c_login@6 as c_login, c_email_address@7 as c_email_address, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_customer_id, c_first_name, c_last_name, c_preferred_cust_flag, c_birth_country, c_login, c_email_address], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_customer_id, c_first_name, c_last_name, c_preferred_cust_flag, c_birth_country, c_login, c_email_address], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 19 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([c_customer_id@0, c_first_name@1, c_last_name@2, c_preferred_cust_flag@3, c_birth_country@4, c_login@5, c_email_address@6, d_year@7], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([c_customer_id@0, c_first_name@1, c_last_name@2, c_preferred_cust_flag@3, c_birth_country@4, c_login@5, c_email_address@6, d_year@7], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[c_customer_id@0 as c_customer_id, c_first_name@1 as c_first_name, c_last_name@2 as c_last_name, c_preferred_cust_flag@3 as c_preferred_cust_flag, c_birth_country@4 as c_birth_country, c_login@5 as c_login, c_email_address@6 as c_email_address, d_year@11 as d_year], aggr=[sum(catalog_sales.cs_ext_list_price - catalog_sales.cs_ext_wholesale_cost - catalog_sales.cs_ext_discount_amt + catalog_sales.cs_ext_sales_price / Int64(2))], ordering_mode=PartiallySorted([7]) │ ProjectionExec: expr=[c_customer_id@1 as c_customer_id, c_first_name@2 as c_first_name, c_last_name@3 as c_last_name, c_preferred_cust_flag@4 as c_preferred_cust_flag, c_birth_country@5 as c_birth_country, c_login@6 as c_login, c_email_address@7 as c_email_address, cs_ext_discount_amt@8 as cs_ext_discount_amt, cs_ext_sales_price@9 as cs_ext_sales_price, cs_ext_wholesale_cost@10 as cs_ext_wholesale_cost, cs_ext_list_price@11 as cs_ext_list_price, d_year@0 as d_year] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@2, cs_sold_date_sk@7)], projection=[d_year@1, c_customer_id@3, c_first_name@4, c_last_name@5, c_preferred_cust_flag@6, c_birth_country@7, c_login@8, c_email_address@9, cs_ext_discount_amt@11, cs_ext_sales_price@12, cs_ext_wholesale_cost@13, cs_ext_list_price@14] @@ -428,26 +399,23 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(customer.c_customer_sk AS Float64)@8, cs_bill_customer_sk@1)], projection=[c_customer_id@1, c_first_name@2, c_last_name@3, c_preferred_cust_flag@4, c_birth_country@5, c_login@6, c_email_address@7, cs_sold_date_sk@9, cs_ext_discount_amt@11, cs_ext_sales_price@12, cs_ext_wholesale_cost@13, cs_ext_list_price@14] │ CoalescePartitionsExec │ [Stage 18] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_bill_customer_sk, cs_ext_discount_amt, cs_ext_sales_price, cs_ext_wholesale_cost, cs_ext_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_bill_customer_sk, cs_ext_discount_amt, cs_ext_sales_price, cs_ext_wholesale_cost, cs_ext_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 17 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2002 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2002, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1, required_guarantees=[d_year in (2002)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2002, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1, required_guarantees=[d_year in (2002)] └────────────────────────────────────────────────── ┌───── Stage 18 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_customer_id@1 as c_customer_id, c_first_name@2 as c_first_name, c_last_name@3 as c_last_name, c_preferred_cust_flag@4 as c_preferred_cust_flag, c_birth_country@5 as c_birth_country, c_login@6 as c_login, c_email_address@7 as c_email_address, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_customer_id, c_first_name, c_last_name, c_preferred_cust_flag, c_birth_country, c_login, c_email_address], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_customer_id, c_first_name, c_last_name, c_preferred_cust_flag, c_birth_country, c_login, c_email_address], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 23 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([c_customer_id@0, c_first_name@1, c_last_name@2, c_preferred_cust_flag@3, c_birth_country@4, c_login@5, c_email_address@6, d_year@7], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([c_customer_id@0, c_first_name@1, c_last_name@2, c_preferred_cust_flag@3, c_birth_country@4, c_login@5, c_email_address@6, d_year@7], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[c_customer_id@0 as c_customer_id, c_first_name@1 as c_first_name, c_last_name@2 as c_last_name, c_preferred_cust_flag@3 as c_preferred_cust_flag, c_birth_country@4 as c_birth_country, c_login@5 as c_login, c_email_address@6 as c_email_address, d_year@11 as d_year], aggr=[sum(web_sales.ws_ext_list_price - web_sales.ws_ext_wholesale_cost - web_sales.ws_ext_discount_amt + web_sales.ws_ext_sales_price / Int64(2))], ordering_mode=PartiallySorted([7]) │ ProjectionExec: expr=[c_customer_id@1 as c_customer_id, c_first_name@2 as c_first_name, c_last_name@3 as c_last_name, c_preferred_cust_flag@4 as c_preferred_cust_flag, c_birth_country@5 as c_birth_country, c_login@6 as c_login, c_email_address@7 as c_email_address, ws_ext_discount_amt@8 as ws_ext_discount_amt, ws_ext_sales_price@9 as ws_ext_sales_price, ws_ext_wholesale_cost@10 as ws_ext_wholesale_cost, ws_ext_list_price@11 as ws_ext_list_price, d_year@0 as d_year] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@2, ws_sold_date_sk@7)], projection=[d_year@1, c_customer_id@3, c_first_name@4, c_last_name@5, c_preferred_cust_flag@6, c_birth_country@7, c_login@8, c_email_address@9, ws_ext_discount_amt@11, ws_ext_sales_price@12, ws_ext_wholesale_cost@13, ws_ext_list_price@14] @@ -456,23 +424,20 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(customer.c_customer_sk AS Float64)@8, ws_bill_customer_sk@1)], projection=[c_customer_id@1, c_first_name@2, c_last_name@3, c_preferred_cust_flag@4, c_birth_country@5, c_login@6, c_email_address@7, ws_sold_date_sk@9, ws_ext_discount_amt@11, ws_ext_sales_price@12, ws_ext_wholesale_cost@13, ws_ext_list_price@14] │ CoalescePartitionsExec │ [Stage 22] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_bill_customer_sk, ws_ext_discount_amt, ws_ext_sales_price, ws_ext_wholesale_cost, ws_ext_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_bill_customer_sk, ws_ext_discount_amt, ws_ext_sales_price, ws_ext_wholesale_cost, ws_ext_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 21 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2002 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2002, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1, required_guarantees=[d_year in (2002)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2002, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1, required_guarantees=[d_year in (2002)] └────────────────────────────────────────────────── ┌───── Stage 22 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_customer_id@1 as c_customer_id, c_first_name@2 as c_first_name, c_last_name@3 as c_last_name, c_preferred_cust_flag@4 as c_preferred_cust_flag, c_birth_country@5 as c_birth_country, c_login@6 as c_login, c_email_address@7 as c_email_address, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_customer_id, c_first_name, c_last_name, c_preferred_cust_flag, c_birth_country, c_login, c_email_address], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_customer_id, c_first_name, c_last_name, c_preferred_cust_flag, c_birth_country, c_login, c_email_address], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── "); Ok(()) @@ -513,23 +478,19 @@ mod tests { │ CoalescePartitionsExec │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=12, input_tasks=2 │ DistributedUnionExec: t0:[c0(0/2)] t1:[c0(1/2)] t2:[c1(0/2)] t3:[c1(1/2)] - │ PartitionIsolatorExec: tasks=2 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_store_sk@7 as store_sk, ss_sold_date_sk@0 as date_sk, ss_ext_sales_price@15 as sales_price, CAST(ss_net_profit@22 AS Decimal128(7, 2)) as profit, Some(0),7,2 as return_amt, Some(0),7,2 as net_loss], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] - │ PartitionIsolatorExec: tasks=2 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], ...]}, projection=[sr_store_sk@7 as store_sk, sr_returned_date_sk@0 as date_sk, Some(0),7,2 as sales_price, Some(0),7,2 as profit, sr_return_amt@11 as return_amt, CAST(sr_net_loss@19 AS Decimal128(7, 2)) as net_loss], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_store_sk@7 as store_sk, ss_sold_date_sk@0 as date_sk, ss_ext_sales_price@15 as sales_price, CAST(ss_net_profit@22 AS Decimal128(7, 2)) as profit, Some(0),7,2 as return_amt, Some(0),7,2 as net_loss], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-3.parquet:..]]}, projection=[sr_store_sk@7 as store_sk, sr_returned_date_sk@0 as date_sk, Some(0),7,2 as sales_price, Some(0),7,2 as profit, sr_return_amt@11 as return_amt, CAST(sr_net_loss@19 AS Decimal128(7, 2)) as net_loss], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p7] t1:[p8..p15] │ BroadcastExec: input_partitions=2, consumer_tasks=4, output_partitions=8 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_id, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_id, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p11] t1:[p12..p23] │ BroadcastExec: input_partitions=3, consumer_tasks=4, output_partitions=12 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_date@1 >= 2000-08-23 AND d_date@1 <= 2000-09-06, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-08-23 AND d_date@2 <= 2000-09-06, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-08-23 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-09-06, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-08-23 AND d_date@2 <= 2000-09-06, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-08-23 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-09-06, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p2] t1:[p0..p2] t2:[p0..p2] t3:[p0..p2] │ RepartitionExec: partitioning=Hash([cp_catalog_page_id@0], 3), input_partitions=3 @@ -542,25 +503,21 @@ mod tests { │ CoalescePartitionsExec │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=12, input_tasks=2 │ DistributedUnionExec: t0:[c0(0/2)] t1:[c0(1/2)] t2:[c1(0/2)] t3:[c1(1/2)] - │ PartitionIsolatorExec: tasks=2 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_catalog_page_sk@12 as page_sk, cs_sold_date_sk@0 as date_sk, cs_ext_sales_price@23 as sales_price, cs_net_profit@33 as profit, Some(0),7,2 as return_amt, Some(0),7,2 as net_loss], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] - │ PartitionIsolatorExec: tasks=2 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], ...]}, projection=[cr_catalog_page_sk@12 as page_sk, CAST(cr_returned_date_sk@0 AS Float64) as date_sk, Some(0),7,2 as sales_price, Some(0),7,2 as profit, cr_return_amount@18 as return_amt, cr_net_loss@26 as net_loss], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_catalog_page_sk@12 as page_sk, cs_sold_date_sk@0 as date_sk, cs_ext_sales_price@23 as sales_price, cs_net_profit@33 as profit, Some(0),7,2 as return_amt, Some(0),7,2 as net_loss], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-3.parquet:..]]}, projection=[cr_catalog_page_sk@12 as page_sk, CAST(cr_returned_date_sk@0 AS Float64) as date_sk, Some(0),7,2 as sales_price, Some(0),7,2 as profit, cr_return_amount@18 as return_amt, cr_net_loss@26 as net_loss], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p11] t1:[p12..p23] │ BroadcastExec: input_partitions=3, consumer_tasks=4, output_partitions=12 │ ProjectionExec: expr=[cp_catalog_page_sk@0 as cp_catalog_page_sk, cp_catalog_page_id@1 as cp_catalog_page_id, CAST(cp_catalog_page_sk@0 AS Float64) as CAST(catalog_page.cp_catalog_page_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_page/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/catalog_page/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/catalog_page/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/catalog_page/part-3.parquet]]}, projection=[cp_catalog_page_sk, cp_catalog_page_id], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_page/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/catalog_page/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/catalog_page/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/catalog_page/part-3.parquet]]}, projection=[cp_catalog_page_sk, cp_catalog_page_id], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p11] t1:[p12..p23] │ BroadcastExec: input_partitions=3, consumer_tasks=4, output_partitions=12 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_date@1 >= 2000-08-23 AND d_date@1 <= 2000-09-06, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-08-23 AND d_date@2 <= 2000-09-06, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-08-23 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-09-06, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-08-23 AND d_date@2 <= 2000-09-06, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-08-23 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-09-06, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 11 ── Tasks: t0:[p0..p2] t1:[p0..p2] t2:[p0..p2] t3:[p0..p2] │ RepartitionExec: partitioning=Hash([web_site_id@0], 3), input_partitions=3 @@ -573,8 +530,7 @@ mod tests { │ CoalescePartitionsExec │ [Stage 8] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=12, input_tasks=2 │ DistributedUnionExec: t0:[c0(0/2)] t1:[c0(1/2)] t2:[c1(0/2)] t3:[c1(1/2)] - │ PartitionIsolatorExec: tasks=2 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_web_site_sk@13 as wsr_web_site_sk, ws_sold_date_sk@0 as date_sk, ws_ext_sales_price@23 as sales_price, ws_net_profit@33 as profit, Some(0),7,2 as return_amt, Some(0),7,2 as net_loss], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_web_site_sk@13 as wsr_web_site_sk, ws_sold_date_sk@0 as date_sk, ws_ext_sales_price@23 as sales_price, ws_net_profit@33 as profit, Some(0),7,2 as return_amt, Some(0),7,2 as net_loss], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] │ ProjectionExec: expr=[ws_web_site_sk@3 as wsr_web_site_sk, wr_returned_date_sk@0 as date_sk, Some(0),7,2 as sales_price, Some(0),7,2 as profit, wr_return_amt@1 as return_amt, wr_net_loss@2 as net_loss] │ HashJoinExec: mode=Partitioned, join_type=Left, on=[(wr_item_sk@1, ws_item_sk@0), (wr_order_number@2, ws_order_number@2)], projection=[wr_returned_date_sk@0, wr_return_amt@3, wr_net_loss@4, ws_web_site_sk@6] │ [Stage 9] => NetworkShuffleExec: output_partitions=3, input_tasks=2 @@ -582,26 +538,22 @@ mod tests { └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p7] t1:[p8..p15] │ BroadcastExec: input_partitions=2, consumer_tasks=4, output_partitions=8 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_site/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_site/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_site/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_site/part-3.parquet]]}, projection=[web_site_sk, web_site_id, CAST(web_site_sk@0 AS Float64) as CAST(web_site.web_site_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_site/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/web_site/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_site/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/web_site/part-3.parquet]]}, projection=[web_site_sk, web_site_id, CAST(web_site_sk@0 AS Float64) as CAST(web_site.web_site_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p11] t1:[p12..p23] │ BroadcastExec: input_partitions=3, consumer_tasks=4, output_partitions=12 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_date@1 >= 2000-08-23 AND d_date@1 <= 2000-09-06, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-08-23 AND d_date@2 <= 2000-09-06, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-08-23 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-09-06, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-08-23 AND d_date@2 <= 2000-09-06, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-08-23 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-09-06, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p5] t1:[p0..p5] │ RepartitionExec: partitioning=Hash([wr_item_sk@1, wr_order_number@2], 6), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_returns/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_returns/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_returns/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_returns/part-3.parquet]]}, projection=[wr_returned_date_sk, wr_item_sk, wr_order_number, wr_return_amt, wr_net_loss], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_returns/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/web_returns/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_returns/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/web_returns/part-3.parquet]]}, projection=[wr_returned_date_sk, wr_item_sk, wr_order_number, wr_return_amt, wr_net_loss], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 10 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([ws_item_sk@0, ws_order_number@2], 6), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_item_sk, ws_web_site_sk, ws_order_number], file_type=parquet + │ RepartitionExec: partitioning=Hash([ws_item_sk@0, ws_order_number@2], 6), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_item_sk, ws_web_site_sk, ws_order_number], file_type=parquet └────────────────────────────────────────────────── "); Ok(()) @@ -609,7 +561,7 @@ mod tests { #[tokio::test] async fn test_tpcds_6() -> Result<()> { let display = test_tpcds_query("q6").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [cnt@1 ASC, state@0 ASC], fetch=100 │ SortExec: TopK(fetch=100), expr=[cnt@1 ASC, state@0 ASC], preserve_partitioning=[true] @@ -636,11 +588,10 @@ mod tests { │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ss_sold_date_sk@1, CAST(d.d_date_sk AS Float64)@2)], projection=[ca_state@0, ss_item_sk@2, d_month_seq@4] │ CoalescePartitionsExec - │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_month_seq@1 as d_month_seq, CAST(d_date_sk@0 AS Float64) as CAST(d.d_date_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_month_seq], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_month_seq], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -652,21 +603,18 @@ mod tests { │ AggregateExec: mode=Partial, gby=[d_month_seq@0 as d_month_seq], aggr=[] │ FilterExec: d_year@1 = 2001 AND d_moy@2 = 1, projection=[d_month_seq@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_month_seq, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2001 AND d_moy@8 = 1, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 1 AND 1 <= d_moy_max@5, required_guarantees=[d_moy in (1), d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_month_seq, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2001 AND d_moy@8 = 1, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 1 AND 1 <= d_moy_max@5, required_guarantees=[d_moy in (1), d_year in (2001)] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_current_price, i_category], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_current_price, i_category], file_type=parquet └────────────────────────────────────────────────── - ┌───── Stage 6 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 6 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(c.c_customer_sk AS Float64)@2, ss_customer_sk@2)], projection=[ca_state@0, ss_sold_date_sk@3, ss_item_sk@4] │ CoalescePartitionsExec │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_customer_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_customer_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -675,20 +623,17 @@ mod tests { │ CoalescePartitionsExec │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=4, input_tasks=2 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_addr_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_addr_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p3] t1:[p4..p7] │ BroadcastExec: input_partitions=2, consumer_tasks=2, output_partitions=4 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_state], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_state], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p2] t1:[p0..p2] │ RepartitionExec: partitioning=Hash([i_category@0], 3), input_partitions=3 │ AggregateExec: mode=Partial, gby=[i_category@1 as i_category], aggr=[avg(j.i_current_price)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_current_price, i_category], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_current_price, i_category], file_type=parquet └────────────────────────────────────────────────── "); Ok(()) @@ -696,7 +641,7 @@ mod tests { #[tokio::test] async fn test_tpcds_7() -> Result<()> { let display = test_tpcds_query("q7").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [i_item_id@0 ASC NULLS LAST], fetch=100 │ [Stage 7] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -717,16 +662,14 @@ mod tests { │ CoalescePartitionsExec │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[p_promo_sk@0 as p_promo_sk, CAST(p_promo_sk@0 AS Float64) as CAST(promotion.p_promo_sk AS Float64)] │ FilterExec: p_channel_email@1 = N OR p_channel_event@2 = N, projection=[p_promo_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/promotion/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-3.parquet]]}, projection=[p_promo_sk, p_channel_email, p_channel_event], file_type=parquet, predicate=p_channel_email@9 = N OR p_channel_event@14 = N, pruning_predicate=p_channel_email_null_count@2 != row_count@3 AND p_channel_email_min@0 <= N AND N <= p_channel_email_max@1 OR p_channel_event_null_count@6 != row_count@3 AND p_channel_event_min@4 <= N AND N <= p_channel_event_max@5, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/promotion/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/promotion/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/promotion/part-3.parquet]]}, projection=[p_promo_sk, p_channel_email, p_channel_event], file_type=parquet, predicate=p_channel_email@9 = N OR p_channel_event@14 = N, pruning_predicate=p_channel_email_null_count@2 != row_count@3 AND p_channel_email_min@0 <= N AND N <= p_channel_email_max@1 OR p_channel_event_null_count@6 != row_count@3 AND p_channel_event_min@4 <= N AND N <= p_channel_event_max@5, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -742,20 +685,17 @@ mod tests { │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2000, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2000, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1, required_guarantees=[d_year in (2000)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2000, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1, required_guarantees=[d_year in (2000)] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([CAST(customer_demographics.cd_demo_sk AS Float64)@1], 9), input_partitions=2 + │ RepartitionExec: partitioning=Hash([CAST(customer_demographics.cd_demo_sk AS Float64)@1], 9), input_partitions=3 │ ProjectionExec: expr=[cd_demo_sk@0 as cd_demo_sk, CAST(cd_demo_sk@0 AS Float64) as CAST(customer_demographics.cd_demo_sk AS Float64)] │ FilterExec: cd_gender@1 = M AND cd_marital_status@2 = S AND cd_education_status@3 = College, projection=[cd_demo_sk@0] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], ...]}, projection=[cd_demo_sk, cd_gender, cd_marital_status, cd_education_status], file_type=parquet, predicate=cd_gender@1 = M AND cd_marital_status@2 = S AND cd_education_status@3 = College, pruning_predicate=cd_gender_null_count@2 != row_count@3 AND cd_gender_min@0 <= M AND M <= cd_gender_max@1 AND cd_marital_status_null_count@6 != row_count@3 AND cd_marital_status_min@4 <= S AND S <= cd_marital_status_max@5 AND cd_education_status_null_count@9 != row_count@3 AND cd_education_status_min@7 <= College AND College <= cd_education_status_max@8, required_guarantees=[cd_education_status in (College), cd_gender in (M), cd_marital_status in (S)] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-3.parquet:..]]}, projection=[cd_demo_sk, cd_gender, cd_marital_status, cd_education_status], file_type=parquet, predicate=cd_gender@1 = M AND cd_marital_status@2 = S AND cd_education_status@3 = College, pruning_predicate=cd_gender_null_count@2 != row_count@3 AND cd_gender_min@0 <= M AND M <= cd_gender_max@1 AND cd_marital_status_null_count@6 != row_count@3 AND cd_marital_status_min@4 <= S AND S <= cd_marital_status_max@5 AND cd_education_status_null_count@9 != row_count@3 AND cd_education_status_min@7 <= College AND College <= cd_education_status_max@8, required_guarantees=[cd_education_status in (College), cd_gender in (M), cd_marital_status in (S)] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([ss_cdemo_sk@2], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_cdemo_sk, ss_promo_sk, ss_quantity, ss_list_price, ss_sales_price, ss_coupon_amt], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([ss_cdemo_sk@2], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_cdemo_sk, ss_promo_sk, ss_quantity, ss_list_price, ss_sales_price, ss_coupon_amt], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── "); Ok(()) @@ -772,7 +712,7 @@ mod tests { │ AggregateExec: mode=Partial, gby=[s_store_name@1 as s_store_name], aggr=[sum(store_sales.ss_net_profit)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(substr(store.s_zip,Int64(1),Int64(2))@3, substr(v1.ca_zip,Int64(1),Int64(2))@1)], projection=[ss_net_profit@0, s_store_name@1] │ CoalescePartitionsExec - │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=2, input_tasks=3 + │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=3, input_tasks=3 │ ProjectionExec: expr=[ca_zip@0 as ca_zip, substr(ca_zip@0, 1, 2) as substr(v1.ca_zip,Int64(1),Int64(2))] │ HashJoinExec: mode=CollectLeft, join_type=LeftSemi, on=[(ca_zip@0, ca_zip@0)], NullsEqual: true │ CoalescePartitionsExec @@ -783,8 +723,8 @@ mod tests { │ AggregateExec: mode=FinalPartitioned, gby=[ca_zip@0 as ca_zip], aggr=[count(Int64(1))] │ [Stage 6] => NetworkShuffleExec: output_partitions=3, input_tasks=2 └────────────────────────────────────────────────── - ┌───── Stage 3 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] - │ BroadcastExec: input_partitions=2, consumer_tasks=1, output_partitions=2 + ┌───── Stage 3 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] + │ BroadcastExec: input_partitions=3, consumer_tasks=1, output_partitions=3 │ ProjectionExec: expr=[ss_net_profit@2 as ss_net_profit, s_store_name@0 as s_store_name, s_zip@1 as s_zip, substr(s_zip@1, 1, 2) as substr(store.s_zip,Int64(1),Int64(2))] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(store.s_store_sk AS Float64)@3, ss_store_sk@0)], projection=[s_store_name@1, s_zip@2, ss_net_profit@5] │ CoalescePartitionsExec @@ -792,21 +732,18 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ss_sold_date_sk@0)], projection=[ss_store_sk@3, ss_net_profit@4] │ CoalescePartitionsExec │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_store_sk, ss_net_profit], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_store_sk, ss_net_profit], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name, s_zip, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name, s_zip, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_qoy@2 = 2 AND d_year@1 = 1998, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_qoy], file_type=parquet, predicate=d_qoy@10 = 2 AND d_year@6 = 1998, pruning_predicate=d_qoy_null_count@2 != row_count@3 AND d_qoy_min@0 <= 2 AND 2 <= d_qoy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 1998 AND 1998 <= d_year_max@5, required_guarantees=[d_qoy in (2), d_year in (1998)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_qoy], file_type=parquet, predicate=d_qoy@10 = 2 AND d_year@6 = 1998, pruning_predicate=d_qoy_null_count@2 != row_count@3 AND d_qoy_min@0 <= 2 AND 2 <= d_qoy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 1998 AND 1998 <= d_year_max@5, required_guarantees=[d_qoy in (2), d_year in (1998)] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p2] t1:[p0..p2] │ RepartitionExec: partitioning=Hash([ca_zip@0], 3), input_partitions=3 @@ -814,8 +751,7 @@ mod tests { │ ProjectionExec: expr=[substr(ca_zip@0, 1, 5) as ca_zip] │ FilterExec: substr(ca_zip@0, 1, 5) IN (SET) ([24128, 76232, 65084, 87816, 83926, 77556, 20548, 26231, 43848, 15126, 91137, 61265, 98294, 25782, 17920, 18426, 98235, 40081, 84093, 28577, 55565, 17183, 54601, 67897, 22752, 86284, 18376, 38607, 45200, 21756, 29741, 96765, 23932, 89360, 29839, 25989, 28898, 91068, 72550, 10390, 18845, 47770, 82636, 41367, 76638, 86198, 81312, 37126, 39192, 88424, 72175, 81426, 53672, 10445, 42666, 66864, 66708, 41248, 48583, 82276, 18842, 78890, 49448, 14089, 38122, 34425, 79077, 19849, 43285, 39861, 66162, 77610, 13695, 99543, 83444, 83041, 12305, 57665, 68341, 25003, 57834, 62878, 49130, 81096, 18840, 27700, 23470, 50412, 21195, 16021, 76107, 71954, 68309, 18119, 98359, 64544, 10336, 86379, 27068, 39736, 98569, 28915, 24206, 56529, 57647, 54917, 42961, 91110, 63981, 14922, 36420, 23006, 67467, 32754, 30903, 20260, 31671, 51798, 72325, 85816, 68621, 13955, 36446, 41766, 68806, 16725, 15146, 22744, 35850, 88086, 51649, 18270, 52867, 39972, 96976, 63792, 11376, 94898, 13595, 10516, 90225, 58943, 39371, 94945, 28587, 96576, 57855, 28488, 26105, 83933, 25858, 34322, 44438, 73171, 30122, 34102, 22685, 71256, 78451, 54364, 13354, 45375, 40558, 56458, 28286, 45266, 47305, 69399, 83921, 26233, 11101, 15371, 69913, 35942, 15882, 25631, 24610, 44165, 99076, 33786, 70738, 26653, 14328, 72305, 62496, 22152, 10144, 64147, 48425, 14663, 21076, 18799, 30450, 63089, 81019, 68893, 24996, 51200, 51211, 45692, 92712, 70466, 79994, 22437, 25280, 38935, 71791, 73134, 56571, 14060, 19505, 72425, 56575, 74351, 68786, 51650, 20004, 18383, 76614, 11634, 18906, 15765, 41368, 73241, 76698, 78567, 97189, 28545, 76231, 75691, 22246, 51061, 90578, 56691, 68014, 51103, 94167, 57047, 14867, 73520, 15734, 63435, 25733, 35474, 24676, 94627, 53535, 17879, 15559, 53268, 59166, 11928, 59402, 33282, 45721, 43933, 68101, 33515, 36634, 71286, 19736, 58058, 55253, 67473, 41918, 19515, 36495, 19430, 22351, 77191, 91393, 49156, 50298, 87501, 18652, 53179, 18767, 63193, 23968, 65164, 68880, 21286, 72823, 58470, 67301, 13394, 31016, 70372, 67030, 40604, 24317, 45748, 39127, 26065, 77721, 31029, 31880, 60576, 24671, 45549, 13376, 50016, 33123, 19769, 22927, 97789, 46081, 72151, 15723, 46136, 51949, 68100, 96888, 64528, 14171, 79777, 28709, 11489, 25103, 32213, 78668, 22245, 15798, 27156, 37930, 62971, 21337, 51622, 67853, 10567, 38415, 15455, 58263, 42029, 60279, 37125, 56240, 88190, 50308, 26859, 64457, 89091, 82136, 62377, 36233, 63837, 58078, 17043, 30010, 60099, 28810, 98025, 29178, 87343, 73273, 30469, 64034, 39516, 86057, 21309, 90257, 67875, 40162, 11356, 73650, 61810, 72013, 30431, 22461, 19512, 13375, 55307, 30625, 83849, 68908, 26689, 96451, 38193, 46820, 88885, 84935, 69035, 83144, 47537, 56616, 94983, 48033, 69952, 25486, 61547, 27385, 61860, 58048, 56910, 16807, 17871, 35258, 31387, 35458, 35576]) │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_zip], file_type=parquet, predicate=substr(ca_zip@9, 1, 5) IN (SET) ([24128, 76232, 65084, 87816, 83926, 77556, 20548, 26231, 43848, 15126, 91137, 61265, 98294, 25782, 17920, 18426, 98235, 40081, 84093, 28577, 55565, 17183, 54601, 67897, 22752, 86284, 18376, 38607, 45200, 21756, 29741, 96765, 23932, 89360, 29839, 25989, 28898, 91068, 72550, 10390, 18845, 47770, 82636, 41367, 76638, 86198, 81312, 37126, 39192, 88424, 72175, 81426, 53672, 10445, 42666, 66864, 66708, 41248, 48583, 82276, 18842, 78890, 49448, 14089, 38122, 34425, 79077, 19849, 43285, 39861, 66162, 77610, 13695, 99543, 83444, 83041, 12305, 57665, 68341, 25003, 57834, 62878, 49130, 81096, 18840, 27700, 23470, 50412, 21195, 16021, 76107, 71954, 68309, 18119, 98359, 64544, 10336, 86379, 27068, 39736, 98569, 28915, 24206, 56529, 57647, 54917, 42961, 91110, 63981, 14922, 36420, 23006, 67467, 32754, 30903, 20260, 31671, 51798, 72325, 85816, 68621, 13955, 36446, 41766, 68806, 16725, 15146, 22744, 35850, 88086, 51649, 18270, 52867, 39972, 96976, 63792, 11376, 94898, 13595, 10516, 90225, 58943, 39371, 94945, 28587, 96576, 57855, 28488, 26105, 83933, 25858, 34322, 44438, 73171, 30122, 34102, 22685, 71256, 78451, 54364, 13354, 45375, 40558, 56458, 28286, 45266, 47305, 69399, 83921, 26233, 11101, 15371, 69913, 35942, 15882, 25631, 24610, 44165, 99076, 33786, 70738, 26653, 14328, 72305, 62496, 22152, 10144, 64147, 48425, 14663, 21076, 18799, 30450, 63089, 81019, 68893, 24996, 51200, 51211, 45692, 92712, 70466, 79994, 22437, 25280, 38935, 71791, 73134, 56571, 14060, 19505, 72425, 56575, 74351, 68786, 51650, 20004, 18383, 76614, 11634, 18906, 15765, 41368, 73241, 76698, 78567, 97189, 28545, 76231, 75691, 22246, 51061, 90578, 56691, 68014, 51103, 94167, 57047, 14867, 73520, 15734, 63435, 25733, 35474, 24676, 94627, 53535, 17879, 15559, 53268, 59166, 11928, 59402, 33282, 45721, 43933, 68101, 33515, 36634, 71286, 19736, 58058, 55253, 67473, 41918, 19515, 36495, 19430, 22351, 77191, 91393, 49156, 50298, 87501, 18652, 53179, 18767, 63193, 23968, 65164, 68880, 21286, 72823, 58470, 67301, 13394, 31016, 70372, 67030, 40604, 24317, 45748, 39127, 26065, 77721, 31029, 31880, 60576, 24671, 45549, 13376, 50016, 33123, 19769, 22927, 97789, 46081, 72151, 15723, 46136, 51949, 68100, 96888, 64528, 14171, 79777, 28709, 11489, 25103, 32213, 78668, 22245, 15798, 27156, 37930, 62971, 21337, 51622, 67853, 10567, 38415, 15455, 58263, 42029, 60279, 37125, 56240, 88190, 50308, 26859, 64457, 89091, 82136, 62377, 36233, 63837, 58078, 17043, 30010, 60099, 28810, 98025, 29178, 87343, 73273, 30469, 64034, 39516, 86057, 21309, 90257, 67875, 40162, 11356, 73650, 61810, 72013, 30431, 22461, 19512, 13375, 55307, 30625, 83849, 68908, 26689, 96451, 38193, 46820, 88885, 84935, 69035, 83144, 47537, 56616, 94983, 48033, 69952, 25486, 61547, 27385, 61860, 58048, 56910, 16807, 17871, 35258, 31387, 35458, 35576]) AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_zip], file_type=parquet, predicate=substr(ca_zip@9, 1, 5) IN (SET) ([24128, 76232, 65084, 87816, 83926, 77556, 20548, 26231, 43848, 15126, 91137, 61265, 98294, 25782, 17920, 18426, 98235, 40081, 84093, 28577, 55565, 17183, 54601, 67897, 22752, 86284, 18376, 38607, 45200, 21756, 29741, 96765, 23932, 89360, 29839, 25989, 28898, 91068, 72550, 10390, 18845, 47770, 82636, 41367, 76638, 86198, 81312, 37126, 39192, 88424, 72175, 81426, 53672, 10445, 42666, 66864, 66708, 41248, 48583, 82276, 18842, 78890, 49448, 14089, 38122, 34425, 79077, 19849, 43285, 39861, 66162, 77610, 13695, 99543, 83444, 83041, 12305, 57665, 68341, 25003, 57834, 62878, 49130, 81096, 18840, 27700, 23470, 50412, 21195, 16021, 76107, 71954, 68309, 18119, 98359, 64544, 10336, 86379, 27068, 39736, 98569, 28915, 24206, 56529, 57647, 54917, 42961, 91110, 63981, 14922, 36420, 23006, 67467, 32754, 30903, 20260, 31671, 51798, 72325, 85816, 68621, 13955, 36446, 41766, 68806, 16725, 15146, 22744, 35850, 88086, 51649, 18270, 52867, 39972, 96976, 63792, 11376, 94898, 13595, 10516, 90225, 58943, 39371, 94945, 28587, 96576, 57855, 28488, 26105, 83933, 25858, 34322, 44438, 73171, 30122, 34102, 22685, 71256, 78451, 54364, 13354, 45375, 40558, 56458, 28286, 45266, 47305, 69399, 83921, 26233, 11101, 15371, 69913, 35942, 15882, 25631, 24610, 44165, 99076, 33786, 70738, 26653, 14328, 72305, 62496, 22152, 10144, 64147, 48425, 14663, 21076, 18799, 30450, 63089, 81019, 68893, 24996, 51200, 51211, 45692, 92712, 70466, 79994, 22437, 25280, 38935, 71791, 73134, 56571, 14060, 19505, 72425, 56575, 74351, 68786, 51650, 20004, 18383, 76614, 11634, 18906, 15765, 41368, 73241, 76698, 78567, 97189, 28545, 76231, 75691, 22246, 51061, 90578, 56691, 68014, 51103, 94167, 57047, 14867, 73520, 15734, 63435, 25733, 35474, 24676, 94627, 53535, 17879, 15559, 53268, 59166, 11928, 59402, 33282, 45721, 43933, 68101, 33515, 36634, 71286, 19736, 58058, 55253, 67473, 41918, 19515, 36495, 19430, 22351, 77191, 91393, 49156, 50298, 87501, 18652, 53179, 18767, 63193, 23968, 65164, 68880, 21286, 72823, 58470, 67301, 13394, 31016, 70372, 67030, 40604, 24317, 45748, 39127, 26065, 77721, 31029, 31880, 60576, 24671, 45549, 13376, 50016, 33123, 19769, 22927, 97789, 46081, 72151, 15723, 46136, 51949, 68100, 96888, 64528, 14171, 79777, 28709, 11489, 25103, 32213, 78668, 22245, 15798, 27156, 37930, 62971, 21337, 51622, 67853, 10567, 38415, 15455, 58263, 42029, 60279, 37125, 56240, 88190, 50308, 26859, 64457, 89091, 82136, 62377, 36233, 63837, 58078, 17043, 30010, 60099, 28810, 98025, 29178, 87343, 73273, 30469, 64034, 39516, 86057, 21309, 90257, 67875, 40162, 11356, 73650, 61810, 72013, 30431, 22461, 19512, 13375, 55307, 30625, 83849, 68908, 26689, 96451, 38193, 46820, 88885, 84935, 69035, 83144, 47537, 56616, 94983, 48033, 69952, 25486, 61547, 27385, 61860, 58048, 56910, 16807, 17871, 35258, 31387, 35458, 35576]) AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p2] t1:[p0..p2] │ RepartitionExec: partitioning=Hash([ca_zip@0], 3), input_partitions=3 @@ -824,15 +760,13 @@ mod tests { │ CoalescePartitionsExec │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=6, input_tasks=2 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_zip], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_zip], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=3, consumer_tasks=2, output_partitions=6 │ FilterExec: c_preferred_cust_flag@1 = Y, projection=[c_current_addr_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_current_addr_sk, c_preferred_cust_flag], file_type=parquet, predicate=c_preferred_cust_flag@10 = Y, pruning_predicate=c_preferred_cust_flag_null_count@2 != row_count@3 AND c_preferred_cust_flag_min@0 <= Y AND Y <= c_preferred_cust_flag_max@1, required_guarantees=[c_preferred_cust_flag in (Y)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_current_addr_sk, c_preferred_cust_flag], file_type=parquet, predicate=c_preferred_cust_flag@10 = Y, pruning_predicate=c_preferred_cust_flag_null_count@2 != row_count@3 AND c_preferred_cust_flag_min@0 <= Y AND Y <= c_preferred_cust_flag_max@1, required_guarantees=[c_preferred_cust_flag in (Y)] └────────────────────────────────────────────────── "); Ok(()) @@ -840,7 +774,7 @@ mod tests { #[tokio::test] async fn test_tpcds_9() -> Result<()> { let display = test_tpcds_query("q9").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ CoalescePartitionsExec │ ProjectionExec: expr=[CASE WHEN count(*)@0 > 74129 THEN avg(store_sales.ss_ext_discount_amt)@1 ELSE avg(store_sales.ss_net_paid)@2 END as bucket1, CASE WHEN count(*)@3 > 122840 THEN avg(store_sales.ss_ext_discount_amt)@4 ELSE avg(store_sales.ss_net_paid)@5 END as bucket2, CASE WHEN count(*)@6 > 56580 THEN avg(store_sales.ss_ext_discount_amt)@7 ELSE avg(store_sales.ss_net_paid)@8 END as bucket3, CASE WHEN count(*)@9 > 10097 THEN avg(store_sales.ss_ext_discount_amt)@10 ELSE avg(store_sales.ss_net_paid)@11 END as bucket4, CASE WHEN count(*)@12 > 165306 THEN avg(store_sales.ss_ext_discount_amt)@13 ELSE avg(store_sales.ss_net_paid)@14 END as bucket5] @@ -865,149 +799,133 @@ mod tests { │ ProjectionExec: expr=[count(Int64(1))@0 as count(*)] │ AggregateExec: mode=Final, gby=[], aggr=[count(Int64(1))] │ CoalescePartitionsExec - │ [Stage 2] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 2] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ AggregateExec: mode=Final, gby=[], aggr=[avg(store_sales.ss_ext_discount_amt)] │ CoalescePartitionsExec - │ [Stage 3] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 3] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ AggregateExec: mode=Final, gby=[], aggr=[avg(store_sales.ss_net_paid)] │ CoalescePartitionsExec - │ [Stage 4] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 4] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ ProjectionExec: expr=[count(Int64(1))@0 as count(*)] │ AggregateExec: mode=Final, gby=[], aggr=[count(Int64(1))] │ CoalescePartitionsExec - │ [Stage 5] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 5] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ AggregateExec: mode=Final, gby=[], aggr=[avg(store_sales.ss_ext_discount_amt)] │ CoalescePartitionsExec - │ [Stage 6] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 6] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ AggregateExec: mode=Final, gby=[], aggr=[avg(store_sales.ss_net_paid)] │ CoalescePartitionsExec - │ [Stage 7] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 7] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ ProjectionExec: expr=[count(Int64(1))@0 as count(*)] │ AggregateExec: mode=Final, gby=[], aggr=[count(Int64(1))] │ CoalescePartitionsExec - │ [Stage 8] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 8] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ AggregateExec: mode=Final, gby=[], aggr=[avg(store_sales.ss_ext_discount_amt)] │ CoalescePartitionsExec - │ [Stage 9] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 9] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ AggregateExec: mode=Final, gby=[], aggr=[avg(store_sales.ss_net_paid)] │ CoalescePartitionsExec - │ [Stage 10] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 10] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ ProjectionExec: expr=[count(Int64(1))@0 as count(*)] │ AggregateExec: mode=Final, gby=[], aggr=[count(Int64(1))] │ CoalescePartitionsExec - │ [Stage 11] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 11] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ AggregateExec: mode=Final, gby=[], aggr=[avg(store_sales.ss_ext_discount_amt)] │ CoalescePartitionsExec - │ [Stage 12] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 12] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ AggregateExec: mode=Final, gby=[], aggr=[avg(store_sales.ss_net_paid)] │ CoalescePartitionsExec - │ [Stage 13] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 13] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ ProjectionExec: expr=[count(Int64(1))@0 as count(*)] │ AggregateExec: mode=Final, gby=[], aggr=[count(Int64(1))] │ CoalescePartitionsExec - │ [Stage 14] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 14] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ AggregateExec: mode=Final, gby=[], aggr=[avg(store_sales.ss_ext_discount_amt)] │ CoalescePartitionsExec - │ [Stage 15] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 15] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ AggregateExec: mode=Final, gby=[], aggr=[avg(store_sales.ss_net_paid)] │ CoalescePartitionsExec - │ [Stage 16] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 16] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p2] t1:[p3..p5] │ FilterExec: r_reason_sk@0 = 1, projection=[] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/reason/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/reason/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/reason/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/reason/part-3.parquet]]}, projection=[r_reason_sk], file_type=parquet, predicate=r_reason_sk@0 = 1, pruning_predicate=r_reason_sk_null_count@2 != row_count@3 AND r_reason_sk_min@0 <= 1 AND 1 <= r_reason_sk_max@1, required_guarantees=[r_reason_sk in (1)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/reason/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/reason/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/reason/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/reason/part-3.parquet]]}, projection=[r_reason_sk], file_type=parquet, predicate=r_reason_sk@0 = 1, pruning_predicate=r_reason_sk_null_count@2 != row_count@3 AND r_reason_sk_min@0 <= 1 AND 1 <= r_reason_sk_max@1, required_guarantees=[r_reason_sk in (1)] └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 2 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[count(Int64(1))] │ FilterExec: ss_quantity@0 >= 1 AND ss_quantity@0 <= 20, projection=[] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_quantity], file_type=parquet, predicate=ss_quantity@10 >= 1 AND ss_quantity@10 <= 20, pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 1 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 20, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_quantity], file_type=parquet, predicate=ss_quantity@10 >= 1 AND ss_quantity@10 <= 20, pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 1 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 20, required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 3 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 3 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[avg(store_sales.ss_ext_discount_amt)] │ FilterExec: ss_quantity@0 >= 1 AND ss_quantity@0 <= 20, projection=[ss_ext_discount_amt@1] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_quantity, ss_ext_discount_amt], file_type=parquet, predicate=ss_quantity@10 >= 1 AND ss_quantity@10 <= 20, pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 1 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 20, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_quantity, ss_ext_discount_amt], file_type=parquet, predicate=ss_quantity@10 >= 1 AND ss_quantity@10 <= 20, pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 1 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 20, required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 4 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 4 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[avg(store_sales.ss_net_paid)] │ FilterExec: ss_quantity@0 >= 1 AND ss_quantity@0 <= 20, projection=[ss_net_paid@1] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_quantity, ss_net_paid], file_type=parquet, predicate=ss_quantity@10 >= 1 AND ss_quantity@10 <= 20, pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 1 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 20, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_quantity, ss_net_paid], file_type=parquet, predicate=ss_quantity@10 >= 1 AND ss_quantity@10 <= 20, pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 1 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 20, required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 5 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 5 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[count(Int64(1))] │ FilterExec: ss_quantity@0 >= 21 AND ss_quantity@0 <= 40, projection=[] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_quantity], file_type=parquet, predicate=ss_quantity@10 >= 21 AND ss_quantity@10 <= 40, pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 21 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 40, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_quantity], file_type=parquet, predicate=ss_quantity@10 >= 21 AND ss_quantity@10 <= 40, pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 21 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 40, required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 6 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 6 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[avg(store_sales.ss_ext_discount_amt)] │ FilterExec: ss_quantity@0 >= 21 AND ss_quantity@0 <= 40, projection=[ss_ext_discount_amt@1] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_quantity, ss_ext_discount_amt], file_type=parquet, predicate=ss_quantity@10 >= 21 AND ss_quantity@10 <= 40, pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 21 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 40, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_quantity, ss_ext_discount_amt], file_type=parquet, predicate=ss_quantity@10 >= 21 AND ss_quantity@10 <= 40, pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 21 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 40, required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 7 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 7 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[avg(store_sales.ss_net_paid)] │ FilterExec: ss_quantity@0 >= 21 AND ss_quantity@0 <= 40, projection=[ss_net_paid@1] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_quantity, ss_net_paid], file_type=parquet, predicate=ss_quantity@10 >= 21 AND ss_quantity@10 <= 40, pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 21 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 40, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_quantity, ss_net_paid], file_type=parquet, predicate=ss_quantity@10 >= 21 AND ss_quantity@10 <= 40, pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 21 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 40, required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 8 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 8 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[count(Int64(1))] │ FilterExec: ss_quantity@0 >= 41 AND ss_quantity@0 <= 60, projection=[] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_quantity], file_type=parquet, predicate=ss_quantity@10 >= 41 AND ss_quantity@10 <= 60, pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 41 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 60, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_quantity], file_type=parquet, predicate=ss_quantity@10 >= 41 AND ss_quantity@10 <= 60, pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 41 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 60, required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 9 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 9 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[avg(store_sales.ss_ext_discount_amt)] │ FilterExec: ss_quantity@0 >= 41 AND ss_quantity@0 <= 60, projection=[ss_ext_discount_amt@1] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_quantity, ss_ext_discount_amt], file_type=parquet, predicate=ss_quantity@10 >= 41 AND ss_quantity@10 <= 60, pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 41 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 60, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_quantity, ss_ext_discount_amt], file_type=parquet, predicate=ss_quantity@10 >= 41 AND ss_quantity@10 <= 60, pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 41 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 60, required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 10 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 10 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[avg(store_sales.ss_net_paid)] │ FilterExec: ss_quantity@0 >= 41 AND ss_quantity@0 <= 60, projection=[ss_net_paid@1] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_quantity, ss_net_paid], file_type=parquet, predicate=ss_quantity@10 >= 41 AND ss_quantity@10 <= 60, pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 41 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 60, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_quantity, ss_net_paid], file_type=parquet, predicate=ss_quantity@10 >= 41 AND ss_quantity@10 <= 60, pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 41 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 60, required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 11 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 11 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[count(Int64(1))] │ FilterExec: ss_quantity@0 >= 61 AND ss_quantity@0 <= 80, projection=[] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_quantity], file_type=parquet, predicate=ss_quantity@10 >= 61 AND ss_quantity@10 <= 80, pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 61 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 80, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_quantity], file_type=parquet, predicate=ss_quantity@10 >= 61 AND ss_quantity@10 <= 80, pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 61 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 80, required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 12 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 12 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[avg(store_sales.ss_ext_discount_amt)] │ FilterExec: ss_quantity@0 >= 61 AND ss_quantity@0 <= 80, projection=[ss_ext_discount_amt@1] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_quantity, ss_ext_discount_amt], file_type=parquet, predicate=ss_quantity@10 >= 61 AND ss_quantity@10 <= 80, pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 61 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 80, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_quantity, ss_ext_discount_amt], file_type=parquet, predicate=ss_quantity@10 >= 61 AND ss_quantity@10 <= 80, pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 61 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 80, required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 13 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 13 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[avg(store_sales.ss_net_paid)] │ FilterExec: ss_quantity@0 >= 61 AND ss_quantity@0 <= 80, projection=[ss_net_paid@1] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_quantity, ss_net_paid], file_type=parquet, predicate=ss_quantity@10 >= 61 AND ss_quantity@10 <= 80, pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 61 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 80, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_quantity, ss_net_paid], file_type=parquet, predicate=ss_quantity@10 >= 61 AND ss_quantity@10 <= 80, pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 61 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 80, required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 14 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 14 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[count(Int64(1))] │ FilterExec: ss_quantity@0 >= 81 AND ss_quantity@0 <= 100, projection=[] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_quantity], file_type=parquet, predicate=ss_quantity@10 >= 81 AND ss_quantity@10 <= 100, pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 81 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 100, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_quantity], file_type=parquet, predicate=ss_quantity@10 >= 81 AND ss_quantity@10 <= 100, pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 81 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 100, required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 15 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 15 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[avg(store_sales.ss_ext_discount_amt)] │ FilterExec: ss_quantity@0 >= 81 AND ss_quantity@0 <= 100, projection=[ss_ext_discount_amt@1] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_quantity, ss_ext_discount_amt], file_type=parquet, predicate=ss_quantity@10 >= 81 AND ss_quantity@10 <= 100, pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 81 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 100, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_quantity, ss_ext_discount_amt], file_type=parquet, predicate=ss_quantity@10 >= 81 AND ss_quantity@10 <= 100, pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 81 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 100, required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 16 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 16 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[avg(store_sales.ss_net_paid)] │ FilterExec: ss_quantity@0 >= 81 AND ss_quantity@0 <= 100, projection=[ss_net_paid@1] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_quantity, ss_net_paid], file_type=parquet, predicate=ss_quantity@10 >= 81 AND ss_quantity@10 <= 100, pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 81 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 100, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_quantity, ss_net_paid], file_type=parquet, predicate=ss_quantity@10 >= 81 AND ss_quantity@10 <= 100, pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 81 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 100, required_guarantees=[] └────────────────────────────────────────────────── "); Ok(()) @@ -1015,7 +933,7 @@ mod tests { #[tokio::test] async fn test_tpcds_10() -> Result<()> { let display = test_tpcds_query("q10").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [cd_gender@0 ASC NULLS LAST, cd_marital_status@1 ASC NULLS LAST, cd_education_status@2 ASC NULLS LAST, cd_purchase_estimate@4 ASC NULLS LAST, cd_credit_rating@6 ASC NULLS LAST, cd_dep_count@8 ASC NULLS LAST, cd_dep_employed_count@10 ASC NULLS LAST, cd_dep_college_count@12 ASC NULLS LAST], fetch=100 │ SortExec: TopK(fetch=100), expr=[cd_gender@0 ASC NULLS LAST, cd_marital_status@1 ASC NULLS LAST, cd_education_status@2 ASC NULLS LAST, cd_purchase_estimate@4 ASC NULLS LAST, cd_credit_rating@6 ASC NULLS LAST, cd_dep_count@8 ASC NULLS LAST, cd_dep_employed_count@10 ASC NULLS LAST, cd_dep_college_count@12 ASC NULLS LAST], preserve_partitioning=[true] @@ -1024,74 +942,67 @@ mod tests { │ [Stage 9] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p2] t1:[p0..p2] t2:[p0..p2] - │ RepartitionExec: partitioning=Hash([cd_gender@0, cd_marital_status@1, cd_education_status@2, cd_purchase_estimate@3, cd_credit_rating@4, cd_dep_count@5, cd_dep_employed_count@6, cd_dep_college_count@7], 3), input_partitions=2 + │ RepartitionExec: partitioning=Hash([cd_gender@0, cd_marital_status@1, cd_education_status@2, cd_purchase_estimate@3, cd_credit_rating@4, cd_dep_count@5, cd_dep_employed_count@6, cd_dep_college_count@7], 3), input_partitions=3 │ AggregateExec: mode=Partial, gby=[cd_gender@0 as cd_gender, cd_marital_status@1 as cd_marital_status, cd_education_status@2 as cd_education_status, cd_purchase_estimate@3 as cd_purchase_estimate, cd_credit_rating@4 as cd_credit_rating, cd_dep_count@5 as cd_dep_count, cd_dep_employed_count@6 as cd_dep_employed_count, cd_dep_college_count@7 as cd_dep_college_count], aggr=[count(Int64(1))] │ FilterExec: mark@8 OR mark@9, projection=[cd_gender@0, cd_marital_status@1, cd_education_status@2, cd_purchase_estimate@3, cd_credit_rating@4, cd_dep_count@5, cd_dep_employed_count@6, cd_dep_college_count@7] │ HashJoinExec: mode=CollectLeft, join_type=RightMark, on=[(cs_ship_customer_sk@0, CAST(c.c_customer_sk AS Float64)@10)], projection=[cd_gender@1, cd_marital_status@2, cd_education_status@3, cd_purchase_estimate@4, cd_credit_rating@5, cd_dep_count@6, cd_dep_employed_count@7, cd_dep_college_count@8, mark@9, mark@11] │ CoalescePartitionsExec - │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, cd_gender@1 as cd_gender, cd_marital_status@2 as cd_marital_status, cd_education_status@3 as cd_education_status, cd_purchase_estimate@4 as cd_purchase_estimate, cd_credit_rating@5 as cd_credit_rating, cd_dep_count@6 as cd_dep_count, cd_dep_employed_count@7 as cd_dep_employed_count, cd_dep_college_count@8 as cd_dep_college_count, mark@9 as mark, CAST(c_customer_sk@0 AS Float64) as CAST(c.c_customer_sk AS Float64)] │ HashJoinExec: mode=CollectLeft, join_type=RightMark, on=[(ws_bill_customer_sk@0, CAST(c.c_customer_sk AS Float64)@9)], projection=[c_customer_sk@0, cd_gender@1, cd_marital_status@2, cd_education_status@3, cd_purchase_estimate@4, cd_credit_rating@5, cd_dep_count@6, cd_dep_employed_count@7, cd_dep_college_count@8, mark@10] │ CoalescePartitionsExec - │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, cd_gender@1 as cd_gender, cd_marital_status@2 as cd_marital_status, cd_education_status@3 as cd_education_status, cd_purchase_estimate@4 as cd_purchase_estimate, cd_credit_rating@5 as cd_credit_rating, cd_dep_count@6 as cd_dep_count, cd_dep_employed_count@7 as cd_dep_employed_count, cd_dep_college_count@8 as cd_dep_college_count, CAST(c_customer_sk@0 AS Float64) as CAST(c.c_customer_sk AS Float64)] │ HashJoinExec: mode=CollectLeft, join_type=RightSemi, on=[(ss_customer_sk@0, CAST(c.c_customer_sk AS Float64)@9)], projection=[c_customer_sk@0, cd_gender@1, cd_marital_status@2, cd_education_status@3, cd_purchase_estimate@4, cd_credit_rating@5, cd_dep_count@6, cd_dep_employed_count@7, cd_dep_college_count@8] │ CoalescePartitionsExec - │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, cd_gender@1 as cd_gender, cd_marital_status@2 as cd_marital_status, cd_education_status@3 as cd_education_status, cd_purchase_estimate@4 as cd_purchase_estimate, cd_credit_rating@5 as cd_credit_rating, cd_dep_count@6 as cd_dep_count, cd_dep_employed_count@7 as cd_dep_employed_count, cd_dep_college_count@8 as cd_dep_college_count, CAST(c_customer_sk@0 AS Float64) as CAST(c.c_customer_sk AS Float64)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(c_current_cdemo_sk@1, CAST(customer_demographics.cd_demo_sk AS Float64)@9)], projection=[c_customer_sk@0, cd_gender@3, cd_marital_status@4, cd_education_status@5, cd_purchase_estimate@6, cd_credit_rating@7, cd_dep_count@8, cd_dep_employed_count@9, cd_dep_college_count@10] │ CoalescePartitionsExec │ [Stage 8] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], ...]}, projection=[cd_demo_sk, cd_gender, cd_marital_status, cd_education_status, cd_purchase_estimate, cd_credit_rating, cd_dep_count, cd_dep_employed_count, cd_dep_college_count, CAST(cd_demo_sk@0 AS Float64) as CAST(customer_demographics.cd_demo_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-3.parquet:..]]}, projection=[cd_demo_sk, cd_gender, cd_marital_status, cd_education_status, cd_purchase_estimate, cd_credit_rating, cd_dep_count, cd_dep_employed_count, cd_dep_college_count, CAST(cd_demo_sk@0 AS Float64) as CAST(customer_demographics.cd_demo_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, cs_sold_date_sk@0)], projection=[cs_ship_customer_sk@3] │ CoalescePartitionsExec │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_ship_customer_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_ship_customer_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2002 AND d_moy@2 >= 1 AND d_moy@2 <= 4, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2002 AND d_moy@8 >= 1 AND d_moy@8 <= 4, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1 AND d_moy_null_count@5 != row_count@3 AND d_moy_max@4 >= 1 AND d_moy_null_count@5 != row_count@3 AND d_moy_min@6 <= 4, required_guarantees=[d_year in (2002)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2002 AND d_moy@8 >= 1 AND d_moy@8 <= 4, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1 AND d_moy_null_count@5 != row_count@3 AND d_moy_max@4 >= 1 AND d_moy_null_count@5 != row_count@3 AND d_moy_min@6 <= 4, required_guarantees=[d_year in (2002)] └────────────────────────────────────────────────── - ┌───── Stage 4 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 4 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ws_sold_date_sk@0)], projection=[ws_bill_customer_sk@3] │ CoalescePartitionsExec │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_bill_customer_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_bill_customer_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2002 AND d_moy@2 >= 1 AND d_moy@2 <= 4, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2002 AND d_moy@8 >= 1 AND d_moy@8 <= 4, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1 AND d_moy_null_count@5 != row_count@3 AND d_moy_max@4 >= 1 AND d_moy_null_count@5 != row_count@3 AND d_moy_min@6 <= 4, required_guarantees=[d_year in (2002)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2002 AND d_moy@8 >= 1 AND d_moy@8 <= 4, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1 AND d_moy_null_count@5 != row_count@3 AND d_moy_max@4 >= 1 AND d_moy_null_count@5 != row_count@3 AND d_moy_min@6 <= 4, required_guarantees=[d_year in (2002)] └────────────────────────────────────────────────── - ┌───── Stage 6 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 6 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ss_sold_date_sk@0)], projection=[ss_customer_sk@3] │ CoalescePartitionsExec │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_customer_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_customer_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2002 AND d_moy@2 >= 1 AND d_moy@2 <= 4, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2002 AND d_moy@8 >= 1 AND d_moy@8 <= 4, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1 AND d_moy_null_count@5 != row_count@3 AND d_moy_max@4 >= 1 AND d_moy_null_count@5 != row_count@3 AND d_moy_min@6 <= 4, required_guarantees=[d_year in (2002)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2002 AND d_moy@8 >= 1 AND d_moy@8 <= 4, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1 AND d_moy_null_count@5 != row_count@3 AND d_moy_max@4 >= 1 AND d_moy_null_count@5 != row_count@3 AND d_moy_min@6 <= 4, required_guarantees=[d_year in (2002)] └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -1099,15 +1010,13 @@ mod tests { │ CoalescePartitionsExec │ [Stage 7] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=6, input_tasks=2 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_cdemo_sk, c_current_addr_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_cdemo_sk, c_current_addr_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=3, consumer_tasks=2, output_partitions=6 │ FilterExec: ca_county@1 IN (SET) ([Rush County, Toole County, Jefferson County, Dona Ana County, La Porte County]), projection=[ca_address_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_county], file_type=parquet, predicate=ca_county@7 IN (SET) ([Rush County, Toole County, Jefferson County, Dona Ana County, La Porte County]), pruning_predicate=ca_county_null_count@2 != row_count@3 AND ca_county_min@0 <= Rush County AND Rush County <= ca_county_max@1 OR ca_county_null_count@2 != row_count@3 AND ca_county_min@0 <= Toole County AND Toole County <= ca_county_max@1 OR ca_county_null_count@2 != row_count@3 AND ca_county_min@0 <= Jefferson County AND Jefferson County <= ca_county_max@1 OR ca_county_null_count@2 != row_count@3 AND ca_county_min@0 <= Dona Ana County AND Dona Ana County <= ca_county_max@1 OR ca_county_null_count@2 != row_count@3 AND ca_county_min@0 <= La Porte County AND La Porte County <= ca_county_max@1, required_guarantees=[ca_county in (Dona Ana County, Jefferson County, La Porte County, Rush County, Toole County)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_county], file_type=parquet, predicate=ca_county@7 IN (SET) ([Rush County, Toole County, Jefferson County, Dona Ana County, La Porte County]), pruning_predicate=ca_county_null_count@2 != row_count@3 AND ca_county_min@0 <= Rush County AND Rush County <= ca_county_max@1 OR ca_county_null_count@2 != row_count@3 AND ca_county_min@0 <= Toole County AND Toole County <= ca_county_max@1 OR ca_county_null_count@2 != row_count@3 AND ca_county_min@0 <= Jefferson County AND Jefferson County <= ca_county_max@1 OR ca_county_null_count@2 != row_count@3 AND ca_county_min@0 <= Dona Ana County AND Dona Ana County <= ca_county_max@1 OR ca_county_null_count@2 != row_count@3 AND ca_county_min@0 <= La Porte County AND La Porte County <= ca_county_max@1, required_guarantees=[ca_county in (Dona Ana County, Jefferson County, La Porte County, Rush County, Toole County)] └────────────────────────────────────────────────── "); Ok(()) @@ -1115,7 +1024,7 @@ mod tests { #[tokio::test] async fn test_tpcds_11() -> Result<()> { let display = test_tpcds_query("q11").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [customer_id@0 ASC, customer_first_name@1 ASC, customer_last_name@2 ASC, customer_preferred_cust_flag@3 ASC], fetch=100 │ [Stage 16] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -1151,7 +1060,7 @@ mod tests { │ [Stage 3] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([c_customer_id@0, c_first_name@1, c_last_name@2, c_preferred_cust_flag@3, c_birth_country@4, c_login@5, c_email_address@6, d_year@7], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([c_customer_id@0, c_first_name@1, c_last_name@2, c_preferred_cust_flag@3, c_birth_country@4, c_login@5, c_email_address@6, d_year@7], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[c_customer_id@0 as c_customer_id, c_first_name@1 as c_first_name, c_last_name@2 as c_last_name, c_preferred_cust_flag@3 as c_preferred_cust_flag, c_birth_country@4 as c_birth_country, c_login@5 as c_login, c_email_address@6 as c_email_address, d_year@9 as d_year], aggr=[sum(web_sales.ws_ext_list_price - web_sales.ws_ext_discount_amt)], ordering_mode=PartiallySorted([7]) │ ProjectionExec: expr=[c_customer_id@1 as c_customer_id, c_first_name@2 as c_first_name, c_last_name@3 as c_last_name, c_preferred_cust_flag@4 as c_preferred_cust_flag, c_birth_country@5 as c_birth_country, c_login@6 as c_login, c_email_address@7 as c_email_address, ws_ext_discount_amt@8 as ws_ext_discount_amt, ws_ext_list_price@9 as ws_ext_list_price, d_year@0 as d_year] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@2, ws_sold_date_sk@7)], projection=[d_year@1, c_customer_id@3, c_first_name@4, c_last_name@5, c_preferred_cust_flag@6, c_birth_country@7, c_login@8, c_email_address@9, ws_ext_discount_amt@11, ws_ext_list_price@12] @@ -1160,23 +1069,20 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(customer.c_customer_sk AS Float64)@8, ws_bill_customer_sk@1)], projection=[c_customer_id@1, c_first_name@2, c_last_name@3, c_preferred_cust_flag@4, c_birth_country@5, c_login@6, c_email_address@7, ws_sold_date_sk@9, ws_ext_discount_amt@11, ws_ext_list_price@12] │ CoalescePartitionsExec │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_bill_customer_sk, ws_ext_discount_amt, ws_ext_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_bill_customer_sk, ws_ext_discount_amt, ws_ext_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2001 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2001, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2001, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (2001)] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_customer_id@1 as c_customer_id, c_first_name@2 as c_first_name, c_last_name@3 as c_last_name, c_preferred_cust_flag@4 as c_preferred_cust_flag, c_birth_country@5 as c_birth_country, c_login@6 as c_login, c_email_address@7 as c_email_address, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_customer_id, c_first_name, c_last_name, c_preferred_cust_flag, c_birth_country, c_login, c_email_address], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_customer_id, c_first_name, c_last_name, c_preferred_cust_flag, c_birth_country, c_login, c_email_address], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=3, consumer_tasks=2, output_partitions=6 @@ -1187,7 +1093,7 @@ mod tests { │ [Stage 7] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([c_customer_id@0, c_first_name@1, c_last_name@2, c_preferred_cust_flag@3, c_birth_country@4, c_login@5, c_email_address@6, d_year@7], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([c_customer_id@0, c_first_name@1, c_last_name@2, c_preferred_cust_flag@3, c_birth_country@4, c_login@5, c_email_address@6, d_year@7], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[c_customer_id@0 as c_customer_id, c_first_name@1 as c_first_name, c_last_name@2 as c_last_name, c_preferred_cust_flag@3 as c_preferred_cust_flag, c_birth_country@4 as c_birth_country, c_login@5 as c_login, c_email_address@6 as c_email_address, d_year@9 as d_year], aggr=[sum(store_sales.ss_ext_list_price - store_sales.ss_ext_discount_amt)], ordering_mode=PartiallySorted([7]) │ ProjectionExec: expr=[c_customer_id@1 as c_customer_id, c_first_name@2 as c_first_name, c_last_name@3 as c_last_name, c_preferred_cust_flag@4 as c_preferred_cust_flag, c_birth_country@5 as c_birth_country, c_login@6 as c_login, c_email_address@7 as c_email_address, ss_ext_discount_amt@8 as ss_ext_discount_amt, ss_ext_list_price@9 as ss_ext_list_price, d_year@0 as d_year] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@2, ss_sold_date_sk@7)], projection=[d_year@1, c_customer_id@3, c_first_name@4, c_last_name@5, c_preferred_cust_flag@6, c_birth_country@7, c_login@8, c_email_address@9, ss_ext_discount_amt@11, ss_ext_list_price@12] @@ -1196,26 +1102,23 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(customer.c_customer_sk AS Float64)@8, ss_customer_sk@1)], projection=[c_customer_id@1, c_first_name@2, c_last_name@3, c_preferred_cust_flag@4, c_birth_country@5, c_login@6, c_email_address@7, ss_sold_date_sk@9, ss_ext_discount_amt@11, ss_ext_list_price@12] │ CoalescePartitionsExec │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_customer_sk, ss_ext_discount_amt, ss_ext_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_customer_sk, ss_ext_discount_amt, ss_ext_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2001 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2001, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2001, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (2001)] └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_customer_id@1 as c_customer_id, c_first_name@2 as c_first_name, c_last_name@3 as c_last_name, c_preferred_cust_flag@4 as c_preferred_cust_flag, c_birth_country@5 as c_birth_country, c_login@6 as c_login, c_email_address@7 as c_email_address, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_customer_id, c_first_name, c_last_name, c_preferred_cust_flag, c_birth_country, c_login, c_email_address], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_customer_id, c_first_name, c_last_name, c_preferred_cust_flag, c_birth_country, c_login, c_email_address], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 11 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([c_customer_id@0, c_first_name@1, c_last_name@2, c_preferred_cust_flag@3, c_birth_country@4, c_login@5, c_email_address@6, d_year@7], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([c_customer_id@0, c_first_name@1, c_last_name@2, c_preferred_cust_flag@3, c_birth_country@4, c_login@5, c_email_address@6, d_year@7], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[c_customer_id@0 as c_customer_id, c_first_name@1 as c_first_name, c_last_name@2 as c_last_name, c_preferred_cust_flag@3 as c_preferred_cust_flag, c_birth_country@4 as c_birth_country, c_login@5 as c_login, c_email_address@6 as c_email_address, d_year@9 as d_year], aggr=[sum(store_sales.ss_ext_list_price - store_sales.ss_ext_discount_amt)], ordering_mode=PartiallySorted([7]) │ ProjectionExec: expr=[c_customer_id@1 as c_customer_id, c_first_name@2 as c_first_name, c_last_name@3 as c_last_name, c_preferred_cust_flag@4 as c_preferred_cust_flag, c_birth_country@5 as c_birth_country, c_login@6 as c_login, c_email_address@7 as c_email_address, ss_ext_discount_amt@8 as ss_ext_discount_amt, ss_ext_list_price@9 as ss_ext_list_price, d_year@0 as d_year] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@2, ss_sold_date_sk@7)], projection=[d_year@1, c_customer_id@3, c_first_name@4, c_last_name@5, c_preferred_cust_flag@6, c_birth_country@7, c_login@8, c_email_address@9, ss_ext_discount_amt@11, ss_ext_list_price@12] @@ -1224,26 +1127,23 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(customer.c_customer_sk AS Float64)@8, ss_customer_sk@1)], projection=[c_customer_id@1, c_first_name@2, c_last_name@3, c_preferred_cust_flag@4, c_birth_country@5, c_login@6, c_email_address@7, ss_sold_date_sk@9, ss_ext_discount_amt@11, ss_ext_list_price@12] │ CoalescePartitionsExec │ [Stage 10] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_customer_sk, ss_ext_discount_amt, ss_ext_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_customer_sk, ss_ext_discount_amt, ss_ext_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2002 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2002, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1, required_guarantees=[d_year in (2002)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2002, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1, required_guarantees=[d_year in (2002)] └────────────────────────────────────────────────── ┌───── Stage 10 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_customer_id@1 as c_customer_id, c_first_name@2 as c_first_name, c_last_name@3 as c_last_name, c_preferred_cust_flag@4 as c_preferred_cust_flag, c_birth_country@5 as c_birth_country, c_login@6 as c_login, c_email_address@7 as c_email_address, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_customer_id, c_first_name, c_last_name, c_preferred_cust_flag, c_birth_country, c_login, c_email_address], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_customer_id, c_first_name, c_last_name, c_preferred_cust_flag, c_birth_country, c_login, c_email_address], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 15 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([c_customer_id@0, c_first_name@1, c_last_name@2, c_preferred_cust_flag@3, c_birth_country@4, c_login@5, c_email_address@6, d_year@7], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([c_customer_id@0, c_first_name@1, c_last_name@2, c_preferred_cust_flag@3, c_birth_country@4, c_login@5, c_email_address@6, d_year@7], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[c_customer_id@0 as c_customer_id, c_first_name@1 as c_first_name, c_last_name@2 as c_last_name, c_preferred_cust_flag@3 as c_preferred_cust_flag, c_birth_country@4 as c_birth_country, c_login@5 as c_login, c_email_address@6 as c_email_address, d_year@9 as d_year], aggr=[sum(web_sales.ws_ext_list_price - web_sales.ws_ext_discount_amt)], ordering_mode=PartiallySorted([7]) │ ProjectionExec: expr=[c_customer_id@1 as c_customer_id, c_first_name@2 as c_first_name, c_last_name@3 as c_last_name, c_preferred_cust_flag@4 as c_preferred_cust_flag, c_birth_country@5 as c_birth_country, c_login@6 as c_login, c_email_address@7 as c_email_address, ws_ext_discount_amt@8 as ws_ext_discount_amt, ws_ext_list_price@9 as ws_ext_list_price, d_year@0 as d_year] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@2, ws_sold_date_sk@7)], projection=[d_year@1, c_customer_id@3, c_first_name@4, c_last_name@5, c_preferred_cust_flag@6, c_birth_country@7, c_login@8, c_email_address@9, ws_ext_discount_amt@11, ws_ext_list_price@12] @@ -1252,23 +1152,20 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(customer.c_customer_sk AS Float64)@8, ws_bill_customer_sk@1)], projection=[c_customer_id@1, c_first_name@2, c_last_name@3, c_preferred_cust_flag@4, c_birth_country@5, c_login@6, c_email_address@7, ws_sold_date_sk@9, ws_ext_discount_amt@11, ws_ext_list_price@12] │ CoalescePartitionsExec │ [Stage 14] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_bill_customer_sk, ws_ext_discount_amt, ws_ext_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_bill_customer_sk, ws_ext_discount_amt, ws_ext_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 13 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2002 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2002, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1, required_guarantees=[d_year in (2002)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2002, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1, required_guarantees=[d_year in (2002)] └────────────────────────────────────────────────── ┌───── Stage 14 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_customer_id@1 as c_customer_id, c_first_name@2 as c_first_name, c_last_name@3 as c_last_name, c_preferred_cust_flag@4 as c_preferred_cust_flag, c_birth_country@5 as c_birth_country, c_login@6 as c_login, c_email_address@7 as c_email_address, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_customer_id, c_first_name, c_last_name, c_preferred_cust_flag, c_birth_country, c_login, c_email_address], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_customer_id, c_first_name, c_last_name, c_preferred_cust_flag, c_birth_country, c_login, c_email_address], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── "); Ok(()) @@ -1291,7 +1188,7 @@ mod tests { │ [Stage 3] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([i_item_id@0, i_item_desc@1, i_category@2, i_class@3, i_current_price@4], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([i_item_id@0, i_item_desc@1, i_category@2, i_class@3, i_current_price@4], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[i_item_id@1 as i_item_id, i_item_desc@2 as i_item_desc, i_category@5 as i_category, i_class@4 as i_class, i_current_price@3 as i_current_price], aggr=[sum(web_sales.ws_ext_sales_price)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ws_sold_date_sk@0)], projection=[ws_ext_sales_price@3, i_item_id@4, i_item_desc@5, i_current_price@6, i_class@7, i_category@8] │ CoalescePartitionsExec @@ -1300,23 +1197,20 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, ws_item_sk@1)], projection=[i_item_id@1, i_item_desc@2, i_current_price@3, i_class@4, i_category@5, ws_sold_date_sk@6, ws_ext_sales_price@8] │ CoalescePartitionsExec │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_item_sk, ws_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_item_sk, ws_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_date@1 >= 1999-02-22 AND d_date@1 <= 1999-03-24, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 1999-02-22 AND d_date@2 <= 1999-03-24, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 1999-02-22 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 1999-03-24, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 1999-02-22 AND d_date@2 <= 1999-03-24, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 1999-02-22 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 1999-03-24, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ FilterExec: i_category@5 = Sports OR i_category@5 = Books OR i_category@5 = Home │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id, i_item_desc, i_current_price, i_class, i_category], file_type=parquet, predicate=i_category@12 = Sports OR i_category@12 = Books OR i_category@12 = Home, pruning_predicate=i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Sports AND Sports <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Books AND Books <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Home AND Home <= i_category_max@1, required_guarantees=[i_category in (Books, Home, Sports)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id, i_item_desc, i_current_price, i_class, i_category], file_type=parquet, predicate=i_category@12 = Sports OR i_category@12 = Books OR i_category@12 = Home, pruning_predicate=i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Sports AND Sports <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Books AND Books <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Home AND Home <= i_category_max@1, required_guarantees=[i_category in (Books, Home, Sports)] └────────────────────────────────────────────────── "#); Ok(()) @@ -1324,7 +1218,7 @@ mod tests { #[tokio::test] async fn test_tpcds_13() -> Result<()> { let display = test_tpcds_query("q13").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ ProjectionExec: expr=[avg(store_sales.ss_quantity)@0 as avg1, avg(store_sales.ss_ext_sales_price)@1 as avg2, avg(store_sales.ss_ext_wholesale_cost)@2 as avg3, sum(store_sales.ss_ext_wholesale_cost)@3 as sum(store_sales.ss_ext_wholesale_cost)] │ AggregateExec: mode=Final, gby=[], aggr=[avg(store_sales.ss_quantity), avg(store_sales.ss_ext_sales_price), avg(store_sales.ss_ext_wholesale_cost), sum(store_sales.ss_ext_wholesale_cost)] @@ -1339,8 +1233,7 @@ mod tests { │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2001, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2001 AND DynamicFilter [ empty ], pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2001 AND DynamicFilter [ empty ], pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (2001)] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -1352,44 +1245,39 @@ mod tests { │ ProjectionExec: expr=[__common_expr_5@0 OR ca_state@2 = OH OR ca_state@2 = OR OR ca_state@2 = NM OR ca_state@2 = KY OR ca_state@2 = VA OR __common_expr_5@0 OR ca_state@2 = MS as __common_expr_4, ca_address_sk@1 as ca_address_sk, ca_state@2 as ca_state, ca_country@3 as ca_country] │ ProjectionExec: expr=[ca_state@1 = TX as __common_expr_5, ca_address_sk@0 as ca_address_sk, ca_state@1 as ca_state, ca_country@2 as ca_country] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_state, ca_country], file_type=parquet, predicate=(ca_state@8 = TX OR ca_state@8 = OH OR ca_state@8 = OR OR ca_state@8 = NM OR ca_state@8 = KY OR ca_state@8 = VA OR ca_state@8 = TX OR ca_state@8 = MS) AND (ca_state@8 = TX OR ca_state@8 = OH OR ca_state@8 = OR OR ca_state@8 = NM OR ca_state@8 = KY OR ca_state@8 = VA OR ca_state@8 = TX OR ca_state@8 = MS) AND ca_country@10 = United States AND ca_state@8 IN (SET) ([TX, OH, OR, NM, KY, VA, MS]) AND DynamicFilter [ empty ], pruning_predicate=(ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= TX AND TX <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= OH AND OH <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= OR AND OR <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= NM AND NM <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= KY AND KY <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= VA AND VA <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= TX AND TX <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= MS AND MS <= ca_state_max@1) AND (ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= TX AND TX <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= OH AND OH <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= OR AND OR <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= NM AND NM <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= KY AND KY <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= VA AND VA <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= TX AND TX <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= MS AND MS <= ca_state_max@1) AND ca_country_null_count@6 != row_count@3 AND ca_country_min@4 <= United States AND United States <= ca_country_max@5 AND (ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= TX AND TX <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= OH AND OH <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= OR AND OR <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= NM AND NM <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= KY AND KY <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= VA AND VA <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= MS AND MS <= ca_state_max@1), required_guarantees=[ca_country in (United States), ca_state in (KY, MS, NM, OH, OR, TX, VA)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_state, ca_country], file_type=parquet, predicate=(ca_state@8 = TX OR ca_state@8 = OH OR ca_state@8 = OR OR ca_state@8 = NM OR ca_state@8 = KY OR ca_state@8 = VA OR ca_state@8 = TX OR ca_state@8 = MS) AND (ca_state@8 = TX OR ca_state@8 = OH OR ca_state@8 = OR OR ca_state@8 = NM OR ca_state@8 = KY OR ca_state@8 = VA OR ca_state@8 = TX OR ca_state@8 = MS) AND ca_country@10 = United States AND ca_state@8 IN (SET) ([TX, OH, OR, NM, KY, VA, MS]) AND DynamicFilter [ empty ], pruning_predicate=(ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= TX AND TX <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= OH AND OH <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= OR AND OR <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= NM AND NM <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= KY AND KY <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= VA AND VA <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= TX AND TX <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= MS AND MS <= ca_state_max@1) AND (ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= TX AND TX <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= OH AND OH <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= OR AND OR <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= NM AND NM <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= KY AND KY <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= VA AND VA <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= TX AND TX <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= MS AND MS <= ca_state_max@1) AND ca_country_null_count@6 != row_count@3 AND ca_country_min@4 <= United States AND United States <= ca_country_max@5 AND (ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= TX AND TX <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= OH AND OH <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= OR AND OR <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= NM AND NM <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= KY AND KY <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= VA AND VA <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= MS AND MS <= ca_state_max@1), required_guarantees=[ca_country in (United States), ca_state in (KY, MS, NM, OH, OR, TX, VA)] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ss_hdemo_sk@1, CAST(household_demographics.hd_demo_sk AS Float64)@2)], filter=cd_marital_status@1 = M AND cd_education_status@2 = Advanced Degree AND ss_sales_price@0 >= Some(10000),5,2 AND ss_sales_price@0 <= Some(15000),5,2 AND hd_dep_count@3 = 3 OR cd_marital_status@1 = S AND cd_education_status@2 = College AND ss_sales_price@0 >= Some(5000),5,2 AND ss_sales_price@0 <= Some(10000),5,2 AND hd_dep_count@3 = 1 OR cd_marital_status@1 = W AND cd_education_status@2 = 2 yr Degree AND ss_sales_price@0 >= Some(15000),5,2 AND ss_sales_price@0 <= Some(20000),5,2 AND hd_dep_count@3 = 1, projection=[ss_sold_date_sk@0, ss_addr_sk@2, ss_quantity@3, ss_ext_sales_price@5, ss_ext_wholesale_cost@6, ss_net_profit@7] │ CoalescePartitionsExec - │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[hd_demo_sk@0 as hd_demo_sk, hd_dep_count@1 as hd_dep_count, CAST(hd_demo_sk@0 AS Float64) as CAST(household_demographics.hd_demo_sk AS Float64)] │ FilterExec: (__common_expr_3@0 OR hd_dep_count@2 = 1) AND __common_expr_3@0, projection=[hd_demo_sk@1, hd_dep_count@2] │ ProjectionExec: expr=[hd_dep_count@1 = 3 OR hd_dep_count@1 = 1 as __common_expr_3, hd_demo_sk@0 as hd_demo_sk, hd_dep_count@1 as hd_dep_count] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_dep_count], file_type=parquet, predicate=(hd_dep_count@3 = 3 OR hd_dep_count@3 = 1 OR hd_dep_count@3 = 1) AND (hd_dep_count@3 = 3 OR hd_dep_count@3 = 1) AND DynamicFilter [ empty ], pruning_predicate=(hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 3 AND 3 <= hd_dep_count_max@1 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 1 AND 1 <= hd_dep_count_max@1 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 1 AND 1 <= hd_dep_count_max@1) AND (hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 3 AND 3 <= hd_dep_count_max@1 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 1 AND 1 <= hd_dep_count_max@1), required_guarantees=[hd_dep_count in (1, 3)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_dep_count], file_type=parquet, predicate=(hd_dep_count@3 = 3 OR hd_dep_count@3 = 1 OR hd_dep_count@3 = 1) AND (hd_dep_count@3 = 3 OR hd_dep_count@3 = 1) AND DynamicFilter [ empty ], pruning_predicate=(hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 3 AND 3 <= hd_dep_count_max@1 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 1 AND 1 <= hd_dep_count_max@1 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 1 AND 1 <= hd_dep_count_max@1) AND (hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 3 AND 3 <= hd_dep_count_max@1 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 1 AND 1 <= hd_dep_count_max@1), required_guarantees=[hd_dep_count in (1, 3)] └────────────────────────────────────────────────── - ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ss_cdemo_sk@1, CAST(customer_demographics.cd_demo_sk AS Float64)@3)], filter=cd_marital_status@1 = M AND cd_education_status@2 = Advanced Degree AND ss_sales_price@0 >= Some(10000),5,2 AND ss_sales_price@0 <= Some(15000),5,2 OR cd_marital_status@1 = S AND cd_education_status@2 = College AND ss_sales_price@0 >= Some(5000),5,2 AND ss_sales_price@0 <= Some(10000),5,2 OR cd_marital_status@1 = W AND cd_education_status@2 = 2 yr Degree AND ss_sales_price@0 >= Some(15000),5,2 AND ss_sales_price@0 <= Some(20000),5,2, projection=[ss_sold_date_sk@0, ss_hdemo_sk@2, ss_addr_sk@3, ss_quantity@4, ss_sales_price@5, ss_ext_sales_price@6, ss_ext_wholesale_cost@7, ss_net_profit@8, cd_marital_status@10, cd_education_status@11] │ CoalescePartitionsExec - │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[cd_demo_sk@0 as cd_demo_sk, cd_marital_status@1 as cd_marital_status, cd_education_status@2 as cd_education_status, CAST(cd_demo_sk@0 AS Float64) as CAST(customer_demographics.cd_demo_sk AS Float64)] │ FilterExec: cd_marital_status@1 = M AND cd_education_status@2 = Advanced Degree OR cd_marital_status@1 = S AND cd_education_status@2 = College OR cd_marital_status@1 = W AND cd_education_status@2 = 2 yr Degree - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], ...]}, projection=[cd_demo_sk, cd_marital_status, cd_education_status], file_type=parquet, predicate=(cd_marital_status@2 = M AND cd_education_status@3 = Advanced Degree OR cd_marital_status@2 = S AND cd_education_status@3 = College OR cd_marital_status@2 = W AND cd_education_status@3 = 2 yr Degree) AND DynamicFilter [ empty ], pruning_predicate=cd_marital_status_null_count@2 != row_count@3 AND cd_marital_status_min@0 <= M AND M <= cd_marital_status_max@1 AND cd_education_status_null_count@6 != row_count@3 AND cd_education_status_min@4 <= Advanced Degree AND Advanced Degree <= cd_education_status_max@5 OR cd_marital_status_null_count@2 != row_count@3 AND cd_marital_status_min@0 <= S AND S <= cd_marital_status_max@1 AND cd_education_status_null_count@6 != row_count@3 AND cd_education_status_min@4 <= College AND College <= cd_education_status_max@5 OR cd_marital_status_null_count@2 != row_count@3 AND cd_marital_status_min@0 <= W AND W <= cd_marital_status_max@1 AND cd_education_status_null_count@6 != row_count@3 AND cd_education_status_min@4 <= 2 yr Degree AND 2 yr Degree <= cd_education_status_max@5, required_guarantees=[cd_education_status in (2 yr Degree, Advanced Degree, College), cd_marital_status in (M, S, W)] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-3.parquet:..]]}, projection=[cd_demo_sk, cd_marital_status, cd_education_status], file_type=parquet, predicate=(cd_marital_status@2 = M AND cd_education_status@3 = Advanced Degree OR cd_marital_status@2 = S AND cd_education_status@3 = College OR cd_marital_status@2 = W AND cd_education_status@3 = 2 yr Degree) AND DynamicFilter [ empty ], pruning_predicate=cd_marital_status_null_count@2 != row_count@3 AND cd_marital_status_min@0 <= M AND M <= cd_marital_status_max@1 AND cd_education_status_null_count@6 != row_count@3 AND cd_education_status_min@4 <= Advanced Degree AND Advanced Degree <= cd_education_status_max@5 OR cd_marital_status_null_count@2 != row_count@3 AND cd_marital_status_min@0 <= S AND S <= cd_marital_status_max@1 AND cd_education_status_null_count@6 != row_count@3 AND cd_education_status_min@4 <= College AND College <= cd_education_status_max@5 OR cd_marital_status_null_count@2 != row_count@3 AND cd_marital_status_min@0 <= W AND W <= cd_marital_status_max@1 AND cd_education_status_null_count@6 != row_count@3 AND cd_education_status_min@4 <= 2 yr Degree AND 2 yr Degree <= cd_education_status_max@5, required_guarantees=[cd_education_status in (2 yr Degree, Advanced Degree, College), cd_marital_status in (M, S, W)] └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(store.s_store_sk AS Float64)@1, ss_store_sk@4)], projection=[ss_sold_date_sk@2, ss_cdemo_sk@3, ss_hdemo_sk@4, ss_addr_sk@5, ss_quantity@7, ss_sales_price@8, ss_ext_sales_price@9, ss_ext_wholesale_cost@10, ss_net_profit@11] │ CoalescePartitionsExec │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 │ FilterExec: (ss_net_profit@9 >= Some(10000),6,2 AND ss_net_profit@9 <= Some(20000),6,2 OR ss_net_profit@9 >= Some(15000),6,2 AND ss_net_profit@9 <= Some(30000),6,2 OR ss_net_profit@9 >= Some(5000),6,2 AND ss_net_profit@9 <= Some(25000),6,2) AND (ss_sales_price@6 >= Some(10000),5,2 AND ss_sales_price@6 <= Some(15000),5,2 OR ss_sales_price@6 >= Some(5000),5,2 AND ss_sales_price@6 <= Some(10000),5,2 OR ss_sales_price@6 >= Some(15000),5,2 AND ss_sales_price@6 <= Some(20000),5,2) - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_cdemo_sk, ss_hdemo_sk, ss_addr_sk, ss_store_sk, ss_quantity, ss_sales_price, ss_ext_sales_price, ss_ext_wholesale_cost, ss_net_profit], file_type=parquet, predicate=(ss_net_profit@22 >= Some(10000),6,2 AND ss_net_profit@22 <= Some(20000),6,2 OR ss_net_profit@22 >= Some(15000),6,2 AND ss_net_profit@22 <= Some(30000),6,2 OR ss_net_profit@22 >= Some(5000),6,2 AND ss_net_profit@22 <= Some(25000),6,2) AND (ss_sales_price@13 >= Some(10000),5,2 AND ss_sales_price@13 <= Some(15000),5,2 OR ss_sales_price@13 >= Some(5000),5,2 AND ss_sales_price@13 <= Some(10000),5,2 OR ss_sales_price@13 >= Some(15000),5,2 AND ss_sales_price@13 <= Some(20000),5,2) AND DynamicFilter [ empty ], pruning_predicate=(ss_net_profit_null_count@1 != row_count@2 AND ss_net_profit_max@0 >= Some(10000),6,2 AND ss_net_profit_null_count@1 != row_count@2 AND ss_net_profit_min@3 <= Some(20000),6,2 OR ss_net_profit_null_count@1 != row_count@2 AND ss_net_profit_max@0 >= Some(15000),6,2 AND ss_net_profit_null_count@1 != row_count@2 AND ss_net_profit_min@3 <= Some(30000),6,2 OR ss_net_profit_null_count@1 != row_count@2 AND ss_net_profit_max@0 >= Some(5000),6,2 AND ss_net_profit_null_count@1 != row_count@2 AND ss_net_profit_min@3 <= Some(25000),6,2) AND (ss_sales_price_null_count@5 != row_count@2 AND ss_sales_price_max@4 >= Some(10000),5,2 AND ss_sales_price_null_count@5 != row_count@2 AND ss_sales_price_min@6 <= Some(15000),5,2 OR ss_sales_price_null_count@5 != row_count@2 AND ss_sales_price_max@4 >= Some(5000),5,2 AND ss_sales_price_null_count@5 != row_count@2 AND ss_sales_price_min@6 <= Some(10000),5,2 OR ss_sales_price_null_count@5 != row_count@2 AND ss_sales_price_max@4 >= Some(15000),5,2 AND ss_sales_price_null_count@5 != row_count@2 AND ss_sales_price_min@6 <= Some(20000),5,2), required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_cdemo_sk, ss_hdemo_sk, ss_addr_sk, ss_store_sk, ss_quantity, ss_sales_price, ss_ext_sales_price, ss_ext_wholesale_cost, ss_net_profit], file_type=parquet, predicate=(ss_net_profit@22 >= Some(10000),6,2 AND ss_net_profit@22 <= Some(20000),6,2 OR ss_net_profit@22 >= Some(15000),6,2 AND ss_net_profit@22 <= Some(30000),6,2 OR ss_net_profit@22 >= Some(5000),6,2 AND ss_net_profit@22 <= Some(25000),6,2) AND (ss_sales_price@13 >= Some(10000),5,2 AND ss_sales_price@13 <= Some(15000),5,2 OR ss_sales_price@13 >= Some(5000),5,2 AND ss_sales_price@13 <= Some(10000),5,2 OR ss_sales_price@13 >= Some(15000),5,2 AND ss_sales_price@13 <= Some(20000),5,2) AND DynamicFilter [ empty ], pruning_predicate=(ss_net_profit_null_count@1 != row_count@2 AND ss_net_profit_max@0 >= Some(10000),6,2 AND ss_net_profit_null_count@1 != row_count@2 AND ss_net_profit_min@3 <= Some(20000),6,2 OR ss_net_profit_null_count@1 != row_count@2 AND ss_net_profit_max@0 >= Some(15000),6,2 AND ss_net_profit_null_count@1 != row_count@2 AND ss_net_profit_min@3 <= Some(30000),6,2 OR ss_net_profit_null_count@1 != row_count@2 AND ss_net_profit_max@0 >= Some(5000),6,2 AND ss_net_profit_null_count@1 != row_count@2 AND ss_net_profit_min@3 <= Some(25000),6,2) AND (ss_sales_price_null_count@5 != row_count@2 AND ss_sales_price_max@4 >= Some(10000),5,2 AND ss_sales_price_null_count@5 != row_count@2 AND ss_sales_price_min@6 <= Some(15000),5,2 OR ss_sales_price_null_count@5 != row_count@2 AND ss_sales_price_max@4 >= Some(5000),5,2 AND ss_sales_price_null_count@5 != row_count@2 AND ss_sales_price_min@6 <= Some(10000),5,2 OR ss_sales_price_null_count@5 != row_count@2 AND ss_sales_price_max@4 >= Some(15000),5,2 AND ss_sales_price_null_count@5 != row_count@2 AND ss_sales_price_min@6 <= Some(20000),5,2), required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── "); Ok(()) @@ -1417,7 +1305,7 @@ mod tests { │ AggregateExec: mode=Partial, gby=[i_brand_id@2 as i_brand_id, i_class_id@3 as i_class_id, i_category_id@4 as i_category_id], aggr=[sum(store_sales.ss_quantity * store_sales.ss_list_price), count(Int64(1))] │ HashJoinExec: mode=CollectLeft, join_type=LeftSemi, on=[(ss_item_sk@0, ss_item_sk@0)], projection=[ss_quantity@1, ss_list_price@2, i_brand_id@3, i_class_id@4, i_category_id@5] │ CoalescePartitionsExec - │ [Stage 7] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 7] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ ProjectionExec: expr=[i_item_sk@0 as ss_item_sk] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(brand_id@0, i_brand_id@1), (class_id@1, i_class_id@2), (category_id@2, i_category_id@3)], projection=[i_item_sk@3] │ CoalescePartitionsExec @@ -1435,7 +1323,7 @@ mod tests { │ AggregateExec: mode=Partial, gby=[i_brand_id@2 as i_brand_id, i_class_id@3 as i_class_id, i_category_id@4 as i_category_id], aggr=[sum(catalog_sales.cs_quantity * catalog_sales.cs_list_price), count(Int64(1))] │ HashJoinExec: mode=CollectLeft, join_type=LeftSemi, on=[(cs_item_sk@0, ss_item_sk@0)], projection=[cs_quantity@1, cs_list_price@2, i_brand_id@3, i_class_id@4, i_category_id@5] │ CoalescePartitionsExec - │ [Stage 24] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 24] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ ProjectionExec: expr=[i_item_sk@0 as ss_item_sk] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(brand_id@0, i_brand_id@1), (class_id@1, i_class_id@2), (category_id@2, i_category_id@3)], projection=[i_item_sk@3] │ CoalescePartitionsExec @@ -1453,7 +1341,7 @@ mod tests { │ AggregateExec: mode=Partial, gby=[i_brand_id@2 as i_brand_id, i_class_id@3 as i_class_id, i_category_id@4 as i_category_id], aggr=[sum(web_sales.ws_quantity * web_sales.ws_list_price), count(Int64(1))] │ HashJoinExec: mode=CollectLeft, join_type=LeftSemi, on=[(ws_item_sk@0, ss_item_sk@0)], projection=[ws_quantity@1, ws_list_price@2, i_brand_id@3, i_class_id@4, i_category_id@5] │ CoalescePartitionsExec - │ [Stage 41] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 41] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ ProjectionExec: expr=[i_item_sk@0 as ss_item_sk] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(brand_id@0, i_brand_id@1), (class_id@1, i_class_id@2), (category_id@2, i_category_id@3)], projection=[i_item_sk@3] │ CoalescePartitionsExec @@ -1468,8 +1356,7 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ss_sold_date_sk@0)], projection=[ss_quantity@3, ss_list_price@4] │ CoalescePartitionsExec │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=2 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_quantity, ss_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_quantity, ss_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] │ ProjectionExec: expr=[cs_quantity@0 as quantity, cs_list_price@1 as list_price] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, cs_sold_date_sk@0)], projection=[cs_quantity@3, cs_list_price@4] │ CoalescePartitionsExec @@ -1486,26 +1373,23 @@ mod tests { │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 >= 1999 AND d_year@1 <= 2001, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p2] t1:[p3..p5] │ BroadcastExec: input_partitions=3, consumer_tasks=1, output_partitions=3 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 >= 1999 AND d_year@1 <= 2001, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p2] t1:[p3..p5] │ BroadcastExec: input_partitions=3, consumer_tasks=1, output_partitions=3 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 >= 1999 AND d_year@1 <= 2001, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 7 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 7 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ss_sold_date_sk@0)], projection=[ss_item_sk@3, ss_quantity@4, ss_list_price@5, i_brand_id@6, i_class_id@7, i_category_id@8] │ CoalescePartitionsExec │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 @@ -1513,36 +1397,33 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, ss_item_sk@1)], projection=[i_brand_id@1, i_class_id@2, i_category_id@3, ss_sold_date_sk@4, ss_item_sk@5, ss_quantity@6, ss_list_price@7] │ CoalescePartitionsExec │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_quantity, ss_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_quantity, ss_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2001 AND d_moy@2 = 11, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2001 AND d_moy@8 = 11, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 11 AND 11 <= d_moy_max@5, required_guarantees=[d_moy in (11), d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2001 AND d_moy@8 = 11, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 11 AND 11 <= d_moy_max@5, required_guarantees=[d_moy in (11), d_year in (2001)] └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 17 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ BroadcastExec: input_partitions=3, consumer_tasks=1, output_partitions=3 │ HashJoinExec: mode=CollectLeft, join_type=RightSemi, on=[(i_brand_id@0, brand_id@0), (i_class_id@1, class_id@1), (i_category_id@2, category_id@2)], NullsEqual: true │ CoalescePartitionsExec - │ [Stage 10] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 10] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ AggregateExec: mode=SinglePartitioned, gby=[brand_id@0 as brand_id, class_id@1 as class_id, category_id@2 as category_id], aggr=[] │ HashJoinExec: mode=CollectLeft, join_type=RightSemi, on=[(i_brand_id@0, brand_id@0), (i_class_id@1, class_id@1), (i_category_id@2, category_id@2)], NullsEqual: true │ CoalescePartitionsExec - │ [Stage 13] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 13] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ AggregateExec: mode=FinalPartitioned, gby=[brand_id@0 as brand_id, class_id@1 as class_id, category_id@2 as category_id], aggr=[] │ [Stage 16] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── - ┌───── Stage 10 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 10 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(d3.d_date_sk AS Float64)@1, ws_sold_date_sk@0)], projection=[i_brand_id@3, i_class_id@4, i_category_id@5] │ CoalescePartitionsExec │ [Stage 8] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 @@ -1550,24 +1431,21 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, ws_item_sk@1)], projection=[i_brand_id@1, i_class_id@2, i_category_id@3, ws_sold_date_sk@4] │ CoalescePartitionsExec │ [Stage 9] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_item_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_item_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(d3.d_date_sk AS Float64)] │ FilterExec: d_year@1 >= 1999 AND d_year@1 <= 2001, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id], file_type=parquet └────────────────────────────────────────────────── - ┌───── Stage 13 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 13 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(d2.d_date_sk AS Float64)@1, cs_sold_date_sk@0)], projection=[i_brand_id@3, i_class_id@4, i_category_id@5] │ CoalescePartitionsExec │ [Stage 11] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 @@ -1575,24 +1453,21 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, cs_item_sk@1)], projection=[i_brand_id@1, i_class_id@2, i_category_id@3, cs_sold_date_sk@4] │ CoalescePartitionsExec │ [Stage 12] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_item_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_item_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 11 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(d2.d_date_sk AS Float64)] │ FilterExec: d_year@1 >= 1999 AND d_year@1 <= 2001, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 12 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 16 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([brand_id@0, class_id@1, category_id@2], 9), input_partitions=2 + │ RepartitionExec: partitioning=Hash([brand_id@0, class_id@1, category_id@2], 9), input_partitions=3 │ AggregateExec: mode=Partial, gby=[brand_id@0 as brand_id, class_id@1 as class_id, category_id@2 as category_id], aggr=[] │ ProjectionExec: expr=[i_brand_id@0 as brand_id, i_class_id@1 as class_id, i_category_id@2 as category_id] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(d1.d_date_sk AS Float64)@1, ss_sold_date_sk@0)], projection=[i_brand_id@3, i_class_id@4, i_category_id@5] @@ -1602,21 +1477,18 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, ss_item_sk@1)], projection=[i_brand_id@1, i_class_id@2, i_category_id@3, ss_sold_date_sk@4] │ CoalescePartitionsExec │ [Stage 15] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 14 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(d1.d_date_sk AS Float64)] │ FilterExec: d_year@1 >= 1999 AND d_year@1 <= 2001, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 15 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 21 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] t3:[p9..p11] │ AggregateExec: mode=Partial, gby=[], aggr=[avg(sq2.quantity * sq2.list_price)] @@ -1625,8 +1497,7 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ss_sold_date_sk@0)], projection=[ss_quantity@3, ss_list_price@4] │ CoalescePartitionsExec │ [Stage 18] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=2 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_quantity, ss_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_quantity, ss_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] │ ProjectionExec: expr=[cs_quantity@0 as quantity, cs_list_price@1 as list_price] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, cs_sold_date_sk@0)], projection=[cs_quantity@3, cs_list_price@4] │ CoalescePartitionsExec @@ -1643,26 +1514,23 @@ mod tests { │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 >= 1999 AND d_year@1 <= 2001, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 19 ── Tasks: t0:[p0..p2] t1:[p3..p5] │ BroadcastExec: input_partitions=3, consumer_tasks=1, output_partitions=3 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 >= 1999 AND d_year@1 <= 2001, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 20 ── Tasks: t0:[p0..p2] t1:[p3..p5] │ BroadcastExec: input_partitions=3, consumer_tasks=1, output_partitions=3 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 >= 1999 AND d_year@1 <= 2001, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 24 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 24 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, cs_sold_date_sk@0)], projection=[cs_item_sk@3, cs_quantity@4, cs_list_price@5, i_brand_id@6, i_class_id@7, i_category_id@8] │ CoalescePartitionsExec │ [Stage 22] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 @@ -1670,36 +1538,33 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, cs_item_sk@1)], projection=[i_brand_id@1, i_class_id@2, i_category_id@3, cs_sold_date_sk@4, cs_item_sk@5, cs_quantity@6, cs_list_price@7] │ CoalescePartitionsExec │ [Stage 23] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_item_sk, cs_quantity, cs_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_item_sk, cs_quantity, cs_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 22 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2001 AND d_moy@2 = 11, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2001 AND d_moy@8 = 11, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 11 AND 11 <= d_moy_max@5, required_guarantees=[d_moy in (11), d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2001 AND d_moy@8 = 11, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 11 AND 11 <= d_moy_max@5, required_guarantees=[d_moy in (11), d_year in (2001)] └────────────────────────────────────────────────── ┌───── Stage 23 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 34 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ BroadcastExec: input_partitions=3, consumer_tasks=1, output_partitions=3 │ HashJoinExec: mode=CollectLeft, join_type=RightSemi, on=[(i_brand_id@0, brand_id@0), (i_class_id@1, class_id@1), (i_category_id@2, category_id@2)], NullsEqual: true │ CoalescePartitionsExec - │ [Stage 27] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 27] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ AggregateExec: mode=SinglePartitioned, gby=[brand_id@0 as brand_id, class_id@1 as class_id, category_id@2 as category_id], aggr=[] │ HashJoinExec: mode=CollectLeft, join_type=RightSemi, on=[(i_brand_id@0, brand_id@0), (i_class_id@1, class_id@1), (i_category_id@2, category_id@2)], NullsEqual: true │ CoalescePartitionsExec - │ [Stage 30] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 30] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ AggregateExec: mode=FinalPartitioned, gby=[brand_id@0 as brand_id, class_id@1 as class_id, category_id@2 as category_id], aggr=[] │ [Stage 33] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── - ┌───── Stage 27 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 27 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(d3.d_date_sk AS Float64)@1, ws_sold_date_sk@0)], projection=[i_brand_id@3, i_class_id@4, i_category_id@5] │ CoalescePartitionsExec │ [Stage 25] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 @@ -1707,24 +1572,21 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, ws_item_sk@1)], projection=[i_brand_id@1, i_class_id@2, i_category_id@3, ws_sold_date_sk@4] │ CoalescePartitionsExec │ [Stage 26] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_item_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_item_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 25 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(d3.d_date_sk AS Float64)] │ FilterExec: d_year@1 >= 1999 AND d_year@1 <= 2001, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 26 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id], file_type=parquet └────────────────────────────────────────────────── - ┌───── Stage 30 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 30 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(d2.d_date_sk AS Float64)@1, cs_sold_date_sk@0)], projection=[i_brand_id@3, i_class_id@4, i_category_id@5] │ CoalescePartitionsExec │ [Stage 28] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 @@ -1732,24 +1594,21 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, cs_item_sk@1)], projection=[i_brand_id@1, i_class_id@2, i_category_id@3, cs_sold_date_sk@4] │ CoalescePartitionsExec │ [Stage 29] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_item_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_item_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 28 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(d2.d_date_sk AS Float64)] │ FilterExec: d_year@1 >= 1999 AND d_year@1 <= 2001, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 29 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 33 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([brand_id@0, class_id@1, category_id@2], 9), input_partitions=2 + │ RepartitionExec: partitioning=Hash([brand_id@0, class_id@1, category_id@2], 9), input_partitions=3 │ AggregateExec: mode=Partial, gby=[brand_id@0 as brand_id, class_id@1 as class_id, category_id@2 as category_id], aggr=[] │ ProjectionExec: expr=[i_brand_id@0 as brand_id, i_class_id@1 as class_id, i_category_id@2 as category_id] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(d1.d_date_sk AS Float64)@1, ss_sold_date_sk@0)], projection=[i_brand_id@3, i_class_id@4, i_category_id@5] @@ -1759,21 +1618,18 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, ss_item_sk@1)], projection=[i_brand_id@1, i_class_id@2, i_category_id@3, ss_sold_date_sk@4] │ CoalescePartitionsExec │ [Stage 32] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 31 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(d1.d_date_sk AS Float64)] │ FilterExec: d_year@1 >= 1999 AND d_year@1 <= 2001, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 32 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 38 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] t3:[p9..p11] │ AggregateExec: mode=Partial, gby=[], aggr=[avg(sq2.quantity * sq2.list_price)] @@ -1782,8 +1638,7 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ss_sold_date_sk@0)], projection=[ss_quantity@3, ss_list_price@4] │ CoalescePartitionsExec │ [Stage 35] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=2 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_quantity, ss_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_quantity, ss_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] │ ProjectionExec: expr=[cs_quantity@0 as quantity, cs_list_price@1 as list_price] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, cs_sold_date_sk@0)], projection=[cs_quantity@3, cs_list_price@4] │ CoalescePartitionsExec @@ -1800,26 +1655,23 @@ mod tests { │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 >= 1999 AND d_year@1 <= 2001, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 36 ── Tasks: t0:[p0..p2] t1:[p3..p5] │ BroadcastExec: input_partitions=3, consumer_tasks=1, output_partitions=3 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 >= 1999 AND d_year@1 <= 2001, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 37 ── Tasks: t0:[p0..p2] t1:[p3..p5] │ BroadcastExec: input_partitions=3, consumer_tasks=1, output_partitions=3 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 >= 1999 AND d_year@1 <= 2001, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 41 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 41 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ws_sold_date_sk@0)], projection=[ws_item_sk@3, ws_quantity@4, ws_list_price@5, i_brand_id@6, i_class_id@7, i_category_id@8] │ CoalescePartitionsExec │ [Stage 39] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 @@ -1827,36 +1679,33 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, ws_item_sk@1)], projection=[i_brand_id@1, i_class_id@2, i_category_id@3, ws_sold_date_sk@4, ws_item_sk@5, ws_quantity@6, ws_list_price@7] │ CoalescePartitionsExec │ [Stage 40] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_item_sk, ws_quantity, ws_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_item_sk, ws_quantity, ws_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 39 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2001 AND d_moy@2 = 11, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2001 AND d_moy@8 = 11, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 11 AND 11 <= d_moy_max@5, required_guarantees=[d_moy in (11), d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2001 AND d_moy@8 = 11, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 11 AND 11 <= d_moy_max@5, required_guarantees=[d_moy in (11), d_year in (2001)] └────────────────────────────────────────────────── ┌───── Stage 40 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 51 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ BroadcastExec: input_partitions=3, consumer_tasks=1, output_partitions=3 │ HashJoinExec: mode=CollectLeft, join_type=RightSemi, on=[(i_brand_id@0, brand_id@0), (i_class_id@1, class_id@1), (i_category_id@2, category_id@2)], NullsEqual: true │ CoalescePartitionsExec - │ [Stage 44] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 44] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ AggregateExec: mode=SinglePartitioned, gby=[brand_id@0 as brand_id, class_id@1 as class_id, category_id@2 as category_id], aggr=[] │ HashJoinExec: mode=CollectLeft, join_type=RightSemi, on=[(i_brand_id@0, brand_id@0), (i_class_id@1, class_id@1), (i_category_id@2, category_id@2)], NullsEqual: true │ CoalescePartitionsExec - │ [Stage 47] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 47] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ AggregateExec: mode=FinalPartitioned, gby=[brand_id@0 as brand_id, class_id@1 as class_id, category_id@2 as category_id], aggr=[] │ [Stage 50] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── - ┌───── Stage 44 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 44 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(d3.d_date_sk AS Float64)@1, ws_sold_date_sk@0)], projection=[i_brand_id@3, i_class_id@4, i_category_id@5] │ CoalescePartitionsExec │ [Stage 42] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 @@ -1864,24 +1713,21 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, ws_item_sk@1)], projection=[i_brand_id@1, i_class_id@2, i_category_id@3, ws_sold_date_sk@4] │ CoalescePartitionsExec │ [Stage 43] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_item_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_item_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 42 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(d3.d_date_sk AS Float64)] │ FilterExec: d_year@1 >= 1999 AND d_year@1 <= 2001, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 43 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id], file_type=parquet └────────────────────────────────────────────────── - ┌───── Stage 47 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 47 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(d2.d_date_sk AS Float64)@1, cs_sold_date_sk@0)], projection=[i_brand_id@3, i_class_id@4, i_category_id@5] │ CoalescePartitionsExec │ [Stage 45] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 @@ -1889,24 +1735,21 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, cs_item_sk@1)], projection=[i_brand_id@1, i_class_id@2, i_category_id@3, cs_sold_date_sk@4] │ CoalescePartitionsExec │ [Stage 46] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_item_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_item_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 45 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(d2.d_date_sk AS Float64)] │ FilterExec: d_year@1 >= 1999 AND d_year@1 <= 2001, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 46 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 50 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([brand_id@0, class_id@1, category_id@2], 9), input_partitions=2 + │ RepartitionExec: partitioning=Hash([brand_id@0, class_id@1, category_id@2], 9), input_partitions=3 │ AggregateExec: mode=Partial, gby=[brand_id@0 as brand_id, class_id@1 as class_id, category_id@2 as category_id], aggr=[] │ ProjectionExec: expr=[i_brand_id@0 as brand_id, i_class_id@1 as class_id, i_category_id@2 as category_id] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(d1.d_date_sk AS Float64)@1, ss_sold_date_sk@0)], projection=[i_brand_id@3, i_class_id@4, i_category_id@5] @@ -1916,21 +1759,18 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, ss_item_sk@1)], projection=[i_brand_id@1, i_class_id@2, i_category_id@3, ss_sold_date_sk@4] │ CoalescePartitionsExec │ [Stage 49] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 48 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(d1.d_date_sk AS Float64)] │ FilterExec: d_year@1 >= 1999 AND d_year@1 <= 2001, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 >= 1999 AND d_year@6 <= 2001, pruning_predicate=d_year_null_count@1 != row_count@2 AND d_year_max@0 >= 1999 AND d_year_null_count@1 != row_count@2 AND d_year_min@3 <= 2001, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 49 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id], file_type=parquet └────────────────────────────────────────────────── "); Ok(()) @@ -1938,7 +1778,7 @@ mod tests { #[tokio::test] async fn test_tpcds_15() -> Result<()> { let display = test_tpcds_query("q15").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [ca_zip@0 ASC], fetch=100 │ [Stage 6] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -1968,25 +1808,21 @@ mod tests { │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_qoy@2 = 2 AND d_year@1 = 2001, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_qoy], file_type=parquet, predicate=d_qoy@10 = 2 AND d_year@6 = 2001, pruning_predicate=d_qoy_null_count@2 != row_count@3 AND d_qoy_min@0 <= 2 AND 2 <= d_qoy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 2001 AND 2001 <= d_year_max@5, required_guarantees=[d_qoy in (2), d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_qoy], file_type=parquet, predicate=d_qoy@10 = 2 AND d_year@6 = 2001, pruning_predicate=d_qoy_null_count@2 != row_count@3 AND d_qoy_min@0 <= 2 AND 2 <= d_qoy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 2001 AND 2001 <= d_year_max@5, required_guarantees=[d_qoy in (2), d_year in (2001)] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_state, ca_zip], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_state, ca_zip], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p0..p8] │ RepartitionExec: partitioning=Hash([CAST(customer.c_customer_sk AS Float64)@2], 9), input_partitions=3 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_current_addr_sk@1 as c_current_addr_sk, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_addr_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_addr_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([cs_bill_customer_sk@1], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_bill_customer_sk, cs_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([cs_bill_customer_sk@1], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_bill_customer_sk, cs_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── "); Ok(()) @@ -1994,7 +1830,7 @@ mod tests { #[tokio::test] async fn test_tpcds_16() -> Result<()> { let display = test_tpcds_query("q16").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ ProjectionExec: expr=[count(alias1)@0 as order count, sum(alias2)@1 as total shipping cost, sum(alias3)@2 as total net profit] │ GlobalLimitExec: skip=0, fetch=100 @@ -2008,11 +1844,11 @@ mod tests { │ CoalescePartitionsExec │ HashJoinExec: mode=CollectLeft, join_type=LeftSemi, on=[(cs_order_number@1, cs_order_number@1)], filter=cs_warehouse_sk@0 != cs_warehouse_sk@1, projection=[cs_order_number@1, cs_ext_ship_cost@2, cs_net_profit@3] │ CoalescePartitionsExec - │ [Stage 4] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 4] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_warehouse_sk, cs_order_number], file_type=parquet │ DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-3.parquet:..]]}, projection=[cr_order_number], file_type=parquet └────────────────────────────────────────────────── - ┌───── Stage 4 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 4 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(call_center.cc_call_center_sk AS Float64)@1, cs_call_center_sk@0)], projection=[cs_warehouse_sk@3, cs_order_number@4, cs_ext_ship_cost@5, cs_net_profit@6] │ CoalescePartitionsExec │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 @@ -2022,32 +1858,28 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, cs_ship_date_sk@0)], projection=[cs_ship_addr_sk@3, cs_call_center_sk@4, cs_warehouse_sk@5, cs_order_number@6, cs_ext_ship_cost@7, cs_net_profit@8] │ CoalescePartitionsExec │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_ship_date_sk, cs_ship_addr_sk, cs_call_center_sk, cs_warehouse_sk, cs_order_number, cs_ext_ship_cost, cs_net_profit], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_ship_date_sk, cs_ship_addr_sk, cs_call_center_sk, cs_warehouse_sk, cs_order_number, cs_ext_ship_cost, cs_net_profit], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[cc_call_center_sk@0 as cc_call_center_sk, CAST(cc_call_center_sk@0 AS Float64) as CAST(call_center.cc_call_center_sk AS Float64)] │ FilterExec: cc_county@1 = Williamson County, projection=[cc_call_center_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/call_center/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/call_center/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/call_center/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/call_center/part-3.parquet]]}, projection=[cc_call_center_sk, cc_county], file_type=parquet, predicate=cc_county@25 = Williamson County, pruning_predicate=cc_county_null_count@2 != row_count@3 AND cc_county_min@0 <= Williamson County AND Williamson County <= cc_county_max@1, required_guarantees=[cc_county in (Williamson County)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/call_center/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/call_center/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/call_center/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/call_center/part-3.parquet]]}, projection=[cc_call_center_sk, cc_county], file_type=parquet, predicate=cc_county@25 = Williamson County, pruning_predicate=cc_county_null_count@2 != row_count@3 AND cc_county_min@0 <= Williamson County AND Williamson County <= cc_county_max@1, required_guarantees=[cc_county in (Williamson County)] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[ca_address_sk@0 as ca_address_sk, CAST(ca_address_sk@0 AS Float64) as CAST(customer_address.ca_address_sk AS Float64)] │ FilterExec: ca_state@1 = GA, projection=[ca_address_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_state], file_type=parquet, predicate=ca_state@8 = GA, pruning_predicate=ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= GA AND GA <= ca_state_max@1, required_guarantees=[ca_state in (GA)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_state], file_type=parquet, predicate=ca_state@8 = GA, pruning_predicate=ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= GA AND GA <= ca_state_max@1, required_guarantees=[ca_state in (GA)] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_date@1 >= 2002-02-01 AND d_date@1 <= 2002-04-02, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2002-02-01 AND d_date@2 <= 2002-04-02, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2002-02-01 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2002-04-02, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2002-02-01 AND d_date@2 <= 2002-04-02, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2002-02-01 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2002-04-02, required_guarantees=[] └────────────────────────────────────────────────── "); Ok(()) @@ -2055,7 +1887,7 @@ mod tests { #[tokio::test] async fn test_tpcds_17() -> Result<()> { let display = test_tpcds_query("q17").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [i_item_id@0 ASC, i_item_desc@1 ASC, s_state@2 ASC], fetch=100 │ [Stage 11] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -2073,8 +1905,7 @@ mod tests { │ CoalescePartitionsExec │ [Stage 9] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id, i_item_desc], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id, i_item_desc], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -2098,37 +1929,32 @@ mod tests { └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_state, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_state, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(d3.d_date_sk AS Float64)] │ FilterExec: d_quarter_name@1 = 2001Q1 OR d_quarter_name@1 = 2001Q2 OR d_quarter_name@1 = 2001Q3, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_quarter_name], file_type=parquet, predicate=d_quarter_name@15 = 2001Q1 OR d_quarter_name@15 = 2001Q2 OR d_quarter_name@15 = 2001Q3, pruning_predicate=d_quarter_name_null_count@2 != row_count@3 AND d_quarter_name_min@0 <= 2001Q1 AND 2001Q1 <= d_quarter_name_max@1 OR d_quarter_name_null_count@2 != row_count@3 AND d_quarter_name_min@0 <= 2001Q2 AND 2001Q2 <= d_quarter_name_max@1 OR d_quarter_name_null_count@2 != row_count@3 AND d_quarter_name_min@0 <= 2001Q3 AND 2001Q3 <= d_quarter_name_max@1, required_guarantees=[d_quarter_name in (2001Q1, 2001Q2, 2001Q3)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_quarter_name], file_type=parquet, predicate=d_quarter_name@15 = 2001Q1 OR d_quarter_name@15 = 2001Q2 OR d_quarter_name@15 = 2001Q3, pruning_predicate=d_quarter_name_null_count@2 != row_count@3 AND d_quarter_name_min@0 <= 2001Q1 AND 2001Q1 <= d_quarter_name_max@1 OR d_quarter_name_null_count@2 != row_count@3 AND d_quarter_name_min@0 <= 2001Q2 AND 2001Q2 <= d_quarter_name_max@1 OR d_quarter_name_null_count@2 != row_count@3 AND d_quarter_name_min@0 <= 2001Q3 AND 2001Q3 <= d_quarter_name_max@1, required_guarantees=[d_quarter_name in (2001Q1, 2001Q2, 2001Q3)] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(d2.d_date_sk AS Float64)] │ FilterExec: d_quarter_name@1 = 2001Q1 OR d_quarter_name@1 = 2001Q2 OR d_quarter_name@1 = 2001Q3, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_quarter_name], file_type=parquet, predicate=d_quarter_name@15 = 2001Q1 OR d_quarter_name@15 = 2001Q2 OR d_quarter_name@15 = 2001Q3, pruning_predicate=d_quarter_name_null_count@2 != row_count@3 AND d_quarter_name_min@0 <= 2001Q1 AND 2001Q1 <= d_quarter_name_max@1 OR d_quarter_name_null_count@2 != row_count@3 AND d_quarter_name_min@0 <= 2001Q2 AND 2001Q2 <= d_quarter_name_max@1 OR d_quarter_name_null_count@2 != row_count@3 AND d_quarter_name_min@0 <= 2001Q3 AND 2001Q3 <= d_quarter_name_max@1, required_guarantees=[d_quarter_name in (2001Q1, 2001Q2, 2001Q3)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_quarter_name], file_type=parquet, predicate=d_quarter_name@15 = 2001Q1 OR d_quarter_name@15 = 2001Q2 OR d_quarter_name@15 = 2001Q3, pruning_predicate=d_quarter_name_null_count@2 != row_count@3 AND d_quarter_name_min@0 <= 2001Q1 AND 2001Q1 <= d_quarter_name_max@1 OR d_quarter_name_null_count@2 != row_count@3 AND d_quarter_name_min@0 <= 2001Q2 AND 2001Q2 <= d_quarter_name_max@1 OR d_quarter_name_null_count@2 != row_count@3 AND d_quarter_name_min@0 <= 2001Q3 AND 2001Q3 <= d_quarter_name_max@1, required_guarantees=[d_quarter_name in (2001Q1, 2001Q2, 2001Q3)] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(d1.d_date_sk AS Float64)] │ FilterExec: d_quarter_name@1 = 2001Q1, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_quarter_name], file_type=parquet, predicate=d_quarter_name@15 = 2001Q1, pruning_predicate=d_quarter_name_null_count@2 != row_count@3 AND d_quarter_name_min@0 <= 2001Q1 AND 2001Q1 <= d_quarter_name_max@1, required_guarantees=[d_quarter_name in (2001Q1)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_quarter_name], file_type=parquet, predicate=d_quarter_name@15 = 2001Q1, pruning_predicate=d_quarter_name_null_count@2 != row_count@3 AND d_quarter_name_min@0 <= 2001Q1 AND 2001Q1 <= d_quarter_name_max@1, required_guarantees=[d_quarter_name in (2001Q1)] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([cs_bill_customer_sk@1, cs_item_sk@2], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_bill_customer_sk, cs_item_sk, cs_quantity], file_type=parquet, predicate=DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([cs_bill_customer_sk@1, cs_item_sk@2], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_bill_customer_sk, cs_item_sk, cs_quantity], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] │ RepartitionExec: partitioning=Hash([sr_customer_sk@6, sr_item_sk@5], 9), input_partitions=3 @@ -2138,14 +1964,12 @@ mod tests { │ [Stage 7] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([sr_customer_sk@2, sr_item_sk@1, sr_ticket_number@3], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], ...]}, projection=[sr_returned_date_sk, sr_item_sk, sr_customer_sk, sr_ticket_number, sr_return_quantity], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([sr_customer_sk@2, sr_item_sk@1, sr_ticket_number@3], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-3.parquet:..]]}, projection=[sr_returned_date_sk, sr_item_sk, sr_customer_sk, sr_ticket_number, sr_return_quantity], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([ss_customer_sk@2, ss_item_sk@1, ss_ticket_number@4], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_customer_sk, ss_store_sk, ss_ticket_number, ss_quantity], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([ss_customer_sk@2, ss_item_sk@1, ss_ticket_number@4], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_customer_sk, ss_store_sk, ss_ticket_number, ss_quantity], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── "); Ok(()) @@ -2153,7 +1977,7 @@ mod tests { #[tokio::test] async fn test_tpcds_18() -> Result<()> { let display = test_tpcds_query("q18").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [ca_country@1 ASC, ca_state@2 ASC, ca_county@3 ASC, i_item_id@0 ASC], fetch=100 │ [Stage 10] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -2171,8 +1995,7 @@ mod tests { │ CoalescePartitionsExec │ [Stage 8] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -2192,15 +2015,13 @@ mod tests { │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 1998, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 1998, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1, required_guarantees=[d_year in (1998)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 1998, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1, required_guarantees=[d_year in (1998)] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ FilterExec: ca_state@2 IN (SET) ([MS, IN, ND, OK, NM, VA, MS]) │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_county, ca_state, ca_country], file_type=parquet, predicate=ca_state@8 IN (SET) ([MS, IN, ND, OK, NM, VA, MS]), pruning_predicate=ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= MS AND MS <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= IN AND IN <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= ND AND ND <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= OK AND OK <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= NM AND NM <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= VA AND VA <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= MS AND MS <= ca_state_max@1, required_guarantees=[ca_state in (IN, MS, ND, NM, OK, VA)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_county, ca_state, ca_country], file_type=parquet, predicate=ca_state@8 IN (SET) ([MS, IN, ND, OK, NM, VA, MS]), pruning_predicate=ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= MS AND MS <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= IN AND IN <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= ND AND ND <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= OK AND OK <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= NM AND NM <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= VA AND VA <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= MS AND MS <= ca_state_max@1, required_guarantees=[ca_state in (IN, MS, ND, NM, OK, VA)] └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] │ RepartitionExec: partitioning=Hash([c_current_cdemo_sk@8], 9), input_partitions=3 @@ -2218,25 +2039,21 @@ mod tests { │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_current_cdemo_sk@1 as c_current_cdemo_sk, c_current_addr_sk@2 as c_current_addr_sk, c_birth_year@3 as c_birth_year, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ FilterExec: c_birth_month@3 IN (SET) ([1, 6, 8, 9, 12, 2]), projection=[c_customer_sk@0, c_current_cdemo_sk@1, c_current_addr_sk@2, c_birth_year@4] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_cdemo_sk, c_current_addr_sk, c_birth_month, c_birth_year], file_type=parquet, predicate=c_birth_month@12 IN (SET) ([1, 6, 8, 9, 12, 2]) AND DynamicFilter [ empty ], pruning_predicate=c_birth_month_null_count@2 != row_count@3 AND c_birth_month_min@0 <= 1 AND 1 <= c_birth_month_max@1 OR c_birth_month_null_count@2 != row_count@3 AND c_birth_month_min@0 <= 6 AND 6 <= c_birth_month_max@1 OR c_birth_month_null_count@2 != row_count@3 AND c_birth_month_min@0 <= 8 AND 8 <= c_birth_month_max@1 OR c_birth_month_null_count@2 != row_count@3 AND c_birth_month_min@0 <= 9 AND 9 <= c_birth_month_max@1 OR c_birth_month_null_count@2 != row_count@3 AND c_birth_month_min@0 <= 12 AND 12 <= c_birth_month_max@1 OR c_birth_month_null_count@2 != row_count@3 AND c_birth_month_min@0 <= 2 AND 2 <= c_birth_month_max@1, required_guarantees=[c_birth_month in (1, 12, 2, 6, 8, 9)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_cdemo_sk, c_current_addr_sk, c_birth_month, c_birth_year], file_type=parquet, predicate=c_birth_month@12 IN (SET) ([1, 6, 8, 9, 12, 2]) AND DynamicFilter [ empty ], pruning_predicate=c_birth_month_null_count@2 != row_count@3 AND c_birth_month_min@0 <= 1 AND 1 <= c_birth_month_max@1 OR c_birth_month_null_count@2 != row_count@3 AND c_birth_month_min@0 <= 6 AND 6 <= c_birth_month_max@1 OR c_birth_month_null_count@2 != row_count@3 AND c_birth_month_min@0 <= 8 AND 8 <= c_birth_month_max@1 OR c_birth_month_null_count@2 != row_count@3 AND c_birth_month_min@0 <= 9 AND 9 <= c_birth_month_max@1 OR c_birth_month_null_count@2 != row_count@3 AND c_birth_month_min@0 <= 12 AND 12 <= c_birth_month_max@1 OR c_birth_month_null_count@2 != row_count@3 AND c_birth_month_min@0 <= 2 AND 2 <= c_birth_month_max@1, required_guarantees=[c_birth_month in (1, 12, 2, 6, 8, 9)] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([CAST(cd1.cd_demo_sk AS Float64)@2], 9), input_partitions=2 + │ RepartitionExec: partitioning=Hash([CAST(cd1.cd_demo_sk AS Float64)@2], 9), input_partitions=3 │ ProjectionExec: expr=[cd_demo_sk@0 as cd_demo_sk, cd_dep_count@1 as cd_dep_count, CAST(cd_demo_sk@0 AS Float64) as CAST(cd1.cd_demo_sk AS Float64)] │ FilterExec: cd_gender@1 = F AND cd_education_status@2 = Unknown, projection=[cd_demo_sk@0, cd_dep_count@3] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], ...]}, projection=[cd_demo_sk, cd_gender, cd_education_status, cd_dep_count], file_type=parquet, predicate=cd_gender@1 = F AND cd_education_status@3 = Unknown, pruning_predicate=cd_gender_null_count@2 != row_count@3 AND cd_gender_min@0 <= F AND F <= cd_gender_max@1 AND cd_education_status_null_count@6 != row_count@3 AND cd_education_status_min@4 <= Unknown AND Unknown <= cd_education_status_max@5, required_guarantees=[cd_education_status in (Unknown), cd_gender in (F)] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-3.parquet:..]]}, projection=[cd_demo_sk, cd_gender, cd_education_status, cd_dep_count], file_type=parquet, predicate=cd_gender@1 = F AND cd_education_status@3 = Unknown, pruning_predicate=cd_gender_null_count@2 != row_count@3 AND cd_gender_min@0 <= F AND F <= cd_gender_max@1 AND cd_education_status_null_count@6 != row_count@3 AND cd_education_status_min@4 <= Unknown AND Unknown <= cd_education_status_max@5, required_guarantees=[cd_education_status in (Unknown), cd_gender in (F)] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([cs_bill_cdemo_sk@2], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_bill_customer_sk, cs_bill_cdemo_sk, cs_item_sk, cs_quantity, cs_list_price, cs_sales_price, cs_coupon_amt, cs_net_profit], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([cs_bill_cdemo_sk@2], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_bill_customer_sk, cs_bill_cdemo_sk, cs_item_sk, cs_quantity, cs_list_price, cs_sales_price, cs_coupon_amt, cs_net_profit], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([CAST(cd2.cd_demo_sk AS Float64)@1], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], ...]}, projection=[cd_demo_sk, CAST(cd_demo_sk@0 AS Float64) as CAST(cd2.cd_demo_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([CAST(cd2.cd_demo_sk AS Float64)@1], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-3.parquet:..]]}, projection=[cd_demo_sk, CAST(cd_demo_sk@0 AS Float64) as CAST(cd2.cd_demo_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── "); Ok(()) @@ -2244,7 +2061,7 @@ mod tests { #[tokio::test] async fn test_tpcds_19() -> Result<()> { let display = test_tpcds_query("q19").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [ext_price@4 DESC, brand@1 ASC NULLS LAST, brand_id@0 ASC NULLS LAST, i_manufact_id@2 ASC NULLS LAST, i_manufact@3 ASC NULLS LAST], fetch=100 │ [Stage 7] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -2265,13 +2082,11 @@ mod tests { │ CoalescePartitionsExec │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_zip], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_zip], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_zip, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_zip, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -2280,34 +2095,30 @@ mod tests { │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_current_addr_sk@1 as c_current_addr_sk, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_addr_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_addr_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ss_item_sk@0, i_item_sk@0)], projection=[ss_customer_sk@1, ss_store_sk@2, ss_ext_sales_price@3, i_brand_id@5, i_brand@6, i_manufact_id@7, i_manufact@8] │ CoalescePartitionsExec - │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ FilterExec: i_manager_id@5 = 8, projection=[i_item_sk@0, i_brand_id@1, i_brand@2, i_manufact_id@3, i_manufact@4] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_brand, i_manufact_id, i_manufact, i_manager_id], file_type=parquet, predicate=i_manager_id@20 = 8 AND DynamicFilter [ empty ], pruning_predicate=i_manager_id_null_count@2 != row_count@3 AND i_manager_id_min@0 <= 8 AND 8 <= i_manager_id_max@1, required_guarantees=[i_manager_id in (8)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_brand, i_manufact_id, i_manufact, i_manager_id], file_type=parquet, predicate=i_manager_id@20 = 8 AND DynamicFilter [ empty ], pruning_predicate=i_manager_id_null_count@2 != row_count@3 AND i_manager_id_min@0 <= 8 AND 8 <= i_manager_id_max@1, required_guarantees=[i_manager_id in (8)] └────────────────────────────────────────────────── - ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ss_sold_date_sk@0)], projection=[ss_item_sk@3, ss_customer_sk@4, ss_store_sk@5, ss_ext_sales_price@6] │ CoalescePartitionsExec │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_customer_sk, ss_store_sk, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_customer_sk, ss_store_sk, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_moy@2 = 11 AND d_year@1 = 1998, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_moy@8 = 11 AND d_year@6 = 1998, pruning_predicate=d_moy_null_count@2 != row_count@3 AND d_moy_min@0 <= 11 AND 11 <= d_moy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 1998 AND 1998 <= d_year_max@5, required_guarantees=[d_moy in (11), d_year in (1998)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_moy@8 = 11 AND d_year@6 = 1998, pruning_predicate=d_moy_null_count@2 != row_count@3 AND d_moy_min@0 <= 11 AND 11 <= d_moy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 1998 AND 1998 <= d_year_max@5, required_guarantees=[d_moy in (11), d_year in (1998)] └────────────────────────────────────────────────── "); Ok(()) @@ -2330,7 +2141,7 @@ mod tests { │ [Stage 3] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([i_item_id@0, i_item_desc@1, i_category@2, i_class@3, i_current_price@4], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([i_item_id@0, i_item_desc@1, i_category@2, i_class@3, i_current_price@4], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[i_item_id@1 as i_item_id, i_item_desc@2 as i_item_desc, i_category@5 as i_category, i_class@4 as i_class, i_current_price@3 as i_current_price], aggr=[sum(catalog_sales.cs_ext_sales_price)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, cs_sold_date_sk@0)], projection=[cs_ext_sales_price@3, i_item_id@4, i_item_desc@5, i_current_price@6, i_class@7, i_category@8] │ CoalescePartitionsExec @@ -2339,23 +2150,20 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, cs_item_sk@1)], projection=[i_item_id@1, i_item_desc@2, i_current_price@3, i_class@4, i_category@5, cs_sold_date_sk@6, cs_ext_sales_price@8] │ CoalescePartitionsExec │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_item_sk, cs_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_item_sk, cs_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_date@1 >= 1999-02-22 AND d_date@1 <= 1999-03-24, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 1999-02-22 AND d_date@2 <= 1999-03-24, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 1999-02-22 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 1999-03-24, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 1999-02-22 AND d_date@2 <= 1999-03-24, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 1999-02-22 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 1999-03-24, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ FilterExec: i_category@5 = Sports OR i_category@5 = Books OR i_category@5 = Home │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id, i_item_desc, i_current_price, i_class, i_category], file_type=parquet, predicate=i_category@12 = Sports OR i_category@12 = Books OR i_category@12 = Home, pruning_predicate=i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Sports AND Sports <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Books AND Books <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Home AND Home <= i_category_max@1, required_guarantees=[i_category in (Books, Home, Sports)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id, i_item_desc, i_current_price, i_class, i_category], file_type=parquet, predicate=i_category@12 = Sports OR i_category@12 = Books OR i_category@12 = Home, pruning_predicate=i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Sports AND Sports <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Books AND Books <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Home AND Home <= i_category_max@1, required_guarantees=[i_category in (Books, Home, Sports)] └────────────────────────────────────────────────── "#); Ok(()) @@ -2377,7 +2185,7 @@ mod tests { │ [Stage 4] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([w_warehouse_name@0, i_item_id@1], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([w_warehouse_name@0, i_item_id@1], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[w_warehouse_name@2 as w_warehouse_name, i_item_id@3 as i_item_id], aggr=[sum(CASE WHEN date_dim.d_date < Utf8("2000-03-11") THEN inventory.inv_quantity_on_hand ELSE Int64(0) END), sum(CASE WHEN date_dim.d_date >= Utf8("2000-03-11") THEN inventory.inv_quantity_on_hand ELSE Int64(0) END)] │ ProjectionExec: expr=[d_date@0 as __common_expr_2, inv_quantity_on_hand@1 as inv_quantity_on_hand, w_warehouse_name@2 as w_warehouse_name, i_item_id@3 as i_item_id] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(d_date_sk@0, inv_date_sk@0)], projection=[d_date@1, inv_quantity_on_hand@3, w_warehouse_name@4, i_item_id@5] @@ -2391,27 +2199,23 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(w_warehouse_sk@0, inv_warehouse_sk@2)], projection=[w_warehouse_name@1, inv_date_sk@2, inv_item_sk@3, inv_quantity_on_hand@5] │ CoalescePartitionsExec │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/inventory/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-2.parquet:..], ...]}, projection=[inv_date_sk, inv_item_sk, inv_warehouse_sk, inv_quantity_on_hand], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/inventory/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/inventory/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/inventory/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/inventory/part-3.parquet:..]]}, projection=[inv_date_sk, inv_item_sk, inv_warehouse_sk, inv_quantity_on_hand], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ FilterExec: d_date@1 >= 2000-02-10 AND d_date@1 <= 2000-04-10 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-02-10 AND d_date@2 <= 2000-04-10, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-02-10 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-04-10, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-02-10 AND d_date@2 <= 2000-04-10, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-02-10 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-04-10, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ FilterExec: i_current_price@2 >= Some(99),4,2 AND i_current_price@2 <= Some(149),4,2, projection=[i_item_sk@0, i_item_id@1] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id, i_current_price], file_type=parquet, predicate=i_current_price@5 >= Some(99),4,2 AND i_current_price@5 <= Some(149),4,2, pruning_predicate=i_current_price_null_count@1 != row_count@2 AND i_current_price_max@0 >= Some(99),4,2 AND i_current_price_null_count@1 != row_count@2 AND i_current_price_min@3 <= Some(149),4,2, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id, i_current_price], file_type=parquet, predicate=i_current_price@5 >= Some(99),4,2 AND i_current_price@5 <= Some(149),4,2, pruning_predicate=i_current_price_null_count@1 != row_count@2 AND i_current_price_max@0 >= Some(99),4,2 AND i_current_price_null_count@1 != row_count@2 AND i_current_price_min@3 <= Some(149),4,2, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/warehouse/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-3.parquet]]}, projection=[w_warehouse_sk, w_warehouse_name], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/warehouse/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/warehouse/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/warehouse/part-3.parquet]]}, projection=[w_warehouse_sk, w_warehouse_name], file_type=parquet └────────────────────────────────────────────────── "#); Ok(()) @@ -2419,7 +2223,7 @@ mod tests { #[tokio::test] async fn test_tpcds_22() -> Result<()> { let display = test_tpcds_query("q22").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [qoh@4 ASC, i_product_name@0 ASC, i_brand@1 ASC, i_class@2 ASC, i_category@3 ASC], fetch=100 │ [Stage 4] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -2431,7 +2235,7 @@ mod tests { │ [Stage 3] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([i_product_name@0, i_brand@1, i_class@2, i_category@3, __grouping_id@4], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([i_product_name@0, i_brand@1, i_class@2, i_category@3, __grouping_id@4], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[(NULL as i_product_name, NULL as i_brand, NULL as i_class, NULL as i_category), (i_product_name@4 as i_product_name, NULL as i_brand, NULL as i_class, NULL as i_category), (i_product_name@4 as i_product_name, i_brand@1 as i_brand, NULL as i_class, NULL as i_category), (i_product_name@4 as i_product_name, i_brand@1 as i_brand, i_class@2 as i_class, NULL as i_category), (i_product_name@4 as i_product_name, i_brand@1 as i_brand, i_class@2 as i_class, i_category@3 as i_category)], aggr=[avg(inventory.inv_quantity_on_hand)] │ ProjectionExec: expr=[inv_quantity_on_hand@4 as inv_quantity_on_hand, i_brand@0 as i_brand, i_class@1 as i_class, i_category@2 as i_category, i_product_name@3 as i_product_name] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, inv_item_sk@0)], projection=[i_brand@1, i_class@2, i_category@3, i_product_name@4, inv_quantity_on_hand@6] @@ -2440,20 +2244,17 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(d_date_sk@0, inv_date_sk@0)], projection=[inv_item_sk@2, inv_quantity_on_hand@3] │ CoalescePartitionsExec │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/inventory/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-2.parquet:..], ...]}, projection=[inv_date_sk, inv_item_sk, inv_quantity_on_hand], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/inventory/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/inventory/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/inventory/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/inventory/part-3.parquet:..]]}, projection=[inv_date_sk, inv_item_sk, inv_quantity_on_hand], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand, i_class, i_category, i_product_name], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand, i_class, i_category, i_product_name], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ FilterExec: d_month_seq@1 >= 1200 AND d_month_seq@1 <= 1211, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_month_seq], file_type=parquet, predicate=d_month_seq@3 >= 1200 AND d_month_seq@3 <= 1211, pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1200 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1211, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_month_seq], file_type=parquet, predicate=d_month_seq@3 >= 1200 AND d_month_seq@3 <= 1211, pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1200 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1211, required_guarantees=[] └────────────────────────────────────────────────── "); Ok(()) @@ -2461,7 +2262,7 @@ mod tests { #[tokio::test] async fn test_tpcds_23() -> Result<()> { let display = test_tpcds_query("q23").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [c_last_name@0 ASC, c_first_name@1 ASC, sales@2 ASC], fetch=100 │ [Stage 33] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -2491,15 +2292,15 @@ mod tests { │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(cs_item_sk@1, item_sk@0)], projection=[cs_bill_customer_sk@0, cs_quantity@2, cs_list_price@3, c_first_name@4, c_last_name@5] │ CoalescePartitionsExec - │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[i_item_sk@0 as item_sk] │ FilterExec: count(Int64(1))@1 > 4, projection=[i_item_sk@0] │ ProjectionExec: expr=[i_item_sk@1 as i_item_sk, count(Int64(1))@3 as count(Int64(1))] │ AggregateExec: mode=FinalPartitioned, gby=[itemdesc@0 as itemdesc, i_item_sk@1 as i_item_sk, d_date@2 as d_date], aggr=[count(Int64(1))] │ [Stage 6] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── - ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, cs_sold_date_sk@0)], projection=[cs_bill_customer_sk@3, cs_item_sk@4, cs_quantity@5, cs_list_price@6, c_first_name@7, c_last_name@8] │ CoalescePartitionsExec │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 @@ -2507,51 +2308,45 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(customer.c_customer_sk AS Float64)@3, cs_bill_customer_sk@1)], projection=[c_first_name@1, c_last_name@2, cs_sold_date_sk@4, cs_bill_customer_sk@5, cs_item_sk@6, cs_quantity@7, cs_list_price@8] │ CoalescePartitionsExec │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_bill_customer_sk, cs_item_sk, cs_quantity, cs_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_bill_customer_sk, cs_item_sk, cs_quantity, cs_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2000 AND d_moy@2 = 2, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2000 AND d_moy@8 = 2, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 2 AND 2 <= d_moy_max@5, required_guarantees=[d_moy in (2), d_year in (2000)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2000 AND d_moy@8 = 2, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 2 AND 2 <= d_moy_max@5, required_guarantees=[d_moy in (2), d_year in (2000)] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_first_name@1 as c_first_name, c_last_name@2 as c_last_name, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_first_name, c_last_name], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_first_name, c_last_name], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] │ RepartitionExec: partitioning=Hash([itemdesc@0, i_item_sk@1, d_date@2], 9), input_partitions=3 │ AggregateExec: mode=Partial, gby=[itemdesc@1 as itemdesc, i_item_sk@2 as i_item_sk, d_date@0 as d_date], aggr=[count(Int64(1))] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ss_item_sk@0, i_item_sk@1)], projection=[d_date@1, itemdesc@2, i_item_sk@3] │ CoalescePartitionsExec - │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[substr(i_item_desc@1, 1, 30) as itemdesc, i_item_sk@0 as i_item_sk] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_desc], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_desc], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 5 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[ss_item_sk@1 as ss_item_sk, d_date@0 as d_date] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@2, ss_sold_date_sk@0)], projection=[d_date@1, ss_item_sk@4] │ CoalescePartitionsExec │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_date@1 as d_date, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@2 IN (SET) ([2000, 2001, 2002, 2003]), projection=[d_date_sk@0, d_date@1] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date, d_year], file_type=parquet, predicate=d_year@6 IN (SET) ([2000, 2001, 2002, 2003]), pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2003 AND 2003 <= d_year_max@1, required_guarantees=[d_year in (2000, 2001, 2002, 2003)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date, d_year], file_type=parquet, predicate=d_year@6 IN (SET) ([2000, 2001, 2002, 2003]), pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2003 AND 2003 <= d_year_max@1, required_guarantees=[d_year in (2000, 2001, 2002, 2003)] └────────────────────────────────────────────────── ┌───── Stage 15 ── Tasks: t0:[p0..p8] │ RepartitionExec: partitioning=Hash([c_customer_sk@0], 9), input_partitions=3 @@ -2589,32 +2384,27 @@ mod tests { │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 IN (SET) ([2000, 2001, 2002, 2003]), projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 IN (SET) ([2000, 2001, 2002, 2003]), pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2003 AND 2003 <= d_year_max@1, required_guarantees=[d_year in (2000, 2001, 2002, 2003)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 IN (SET) ([2000, 2001, 2002, 2003]), pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2003 AND 2003 <= d_year_max@1, required_guarantees=[d_year in (2000, 2001, 2002, 2003)] └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p8] t1:[p0..p8] │ RepartitionExec: partitioning=Hash([CAST(customer.c_customer_sk AS Float64)@1], 9), input_partitions=3 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 10 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([ss_customer_sk@1], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_customer_sk, ss_quantity, ss_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([ss_customer_sk@1], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_customer_sk, ss_quantity, ss_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 13 ── Tasks: t0:[p0..p2] t1:[p0..p2] │ RepartitionExec: partitioning=Hash([CAST(customer.c_customer_sk AS Float64)@1], 3), input_partitions=3 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 14 ── Tasks: t0:[p0..p2] t1:[p0..p2] t2:[p0..p2] - │ RepartitionExec: partitioning=Hash([ss_customer_sk@0], 3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_customer_sk, ss_quantity, ss_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([ss_customer_sk@0], 3), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_customer_sk, ss_quantity, ss_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 32 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] │ RepartitionExec: partitioning=Hash([c_last_name@0, c_first_name@1], 6), input_partitions=3 @@ -2631,15 +2421,15 @@ mod tests { │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ws_item_sk@0, item_sk@0)], projection=[ws_bill_customer_sk@1, ws_quantity@2, ws_list_price@3, c_first_name@4, c_last_name@5] │ CoalescePartitionsExec - │ [Stage 19] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 19] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[i_item_sk@0 as item_sk] │ FilterExec: count(Int64(1))@1 > 4, projection=[i_item_sk@0] │ ProjectionExec: expr=[i_item_sk@1 as i_item_sk, count(Int64(1))@3 as count(Int64(1))] │ AggregateExec: mode=FinalPartitioned, gby=[itemdesc@0 as itemdesc, i_item_sk@1 as i_item_sk, d_date@2 as d_date], aggr=[count(Int64(1))] │ [Stage 22] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── - ┌───── Stage 19 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 19 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ws_sold_date_sk@0)], projection=[ws_item_sk@3, ws_bill_customer_sk@4, ws_quantity@5, ws_list_price@6, c_first_name@7, c_last_name@8] │ CoalescePartitionsExec │ [Stage 17] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 @@ -2647,51 +2437,45 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(customer.c_customer_sk AS Float64)@3, ws_bill_customer_sk@2)], projection=[c_first_name@1, c_last_name@2, ws_sold_date_sk@4, ws_item_sk@5, ws_bill_customer_sk@6, ws_quantity@7, ws_list_price@8] │ CoalescePartitionsExec │ [Stage 18] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_item_sk, ws_bill_customer_sk, ws_quantity, ws_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_item_sk, ws_bill_customer_sk, ws_quantity, ws_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 17 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2000 AND d_moy@2 = 2, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2000 AND d_moy@8 = 2, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 2 AND 2 <= d_moy_max@5, required_guarantees=[d_moy in (2), d_year in (2000)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2000 AND d_moy@8 = 2, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 2 AND 2 <= d_moy_max@5, required_guarantees=[d_moy in (2), d_year in (2000)] └────────────────────────────────────────────────── ┌───── Stage 18 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_first_name@1 as c_first_name, c_last_name@2 as c_last_name, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_first_name, c_last_name], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_first_name, c_last_name], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 22 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] │ RepartitionExec: partitioning=Hash([itemdesc@0, i_item_sk@1, d_date@2], 9), input_partitions=3 │ AggregateExec: mode=Partial, gby=[itemdesc@1 as itemdesc, i_item_sk@2 as i_item_sk, d_date@0 as d_date], aggr=[count(Int64(1))] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ss_item_sk@0, i_item_sk@1)], projection=[d_date@1, itemdesc@2, i_item_sk@3] │ CoalescePartitionsExec - │ [Stage 21] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 21] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[substr(i_item_desc@1, 1, 30) as itemdesc, i_item_sk@0 as i_item_sk] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_desc], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_desc], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 21 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 21 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[ss_item_sk@1 as ss_item_sk, d_date@0 as d_date] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@2, ss_sold_date_sk@0)], projection=[d_date@1, ss_item_sk@4] │ CoalescePartitionsExec │ [Stage 20] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 20 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_date@1 as d_date, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@2 IN (SET) ([2000, 2001, 2002, 2003]), projection=[d_date_sk@0, d_date@1] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date, d_year], file_type=parquet, predicate=d_year@6 IN (SET) ([2000, 2001, 2002, 2003]), pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2003 AND 2003 <= d_year_max@1, required_guarantees=[d_year in (2000, 2001, 2002, 2003)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date, d_year], file_type=parquet, predicate=d_year@6 IN (SET) ([2000, 2001, 2002, 2003]), pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2003 AND 2003 <= d_year_max@1, required_guarantees=[d_year in (2000, 2001, 2002, 2003)] └────────────────────────────────────────────────── ┌───── Stage 31 ── Tasks: t0:[p0..p8] │ RepartitionExec: partitioning=Hash([c_customer_sk@0], 9), input_partitions=3 @@ -2729,32 +2513,27 @@ mod tests { │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 IN (SET) ([2000, 2001, 2002, 2003]), projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 IN (SET) ([2000, 2001, 2002, 2003]), pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2003 AND 2003 <= d_year_max@1, required_guarantees=[d_year in (2000, 2001, 2002, 2003)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 IN (SET) ([2000, 2001, 2002, 2003]), pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2003 AND 2003 <= d_year_max@1, required_guarantees=[d_year in (2000, 2001, 2002, 2003)] └────────────────────────────────────────────────── ┌───── Stage 25 ── Tasks: t0:[p0..p8] t1:[p0..p8] │ RepartitionExec: partitioning=Hash([CAST(customer.c_customer_sk AS Float64)@1], 9), input_partitions=3 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 26 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([ss_customer_sk@1], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_customer_sk, ss_quantity, ss_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([ss_customer_sk@1], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_customer_sk, ss_quantity, ss_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 29 ── Tasks: t0:[p0..p2] t1:[p0..p2] │ RepartitionExec: partitioning=Hash([CAST(customer.c_customer_sk AS Float64)@1], 3), input_partitions=3 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 30 ── Tasks: t0:[p0..p2] t1:[p0..p2] t2:[p0..p2] - │ RepartitionExec: partitioning=Hash([ss_customer_sk@0], 3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_customer_sk, ss_quantity, ss_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([ss_customer_sk@0], 3), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_customer_sk, ss_quantity, ss_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── "); Ok(()) @@ -2762,7 +2541,7 @@ mod tests { #[tokio::test] async fn test_tpcds_24() -> Result<()> { let display = test_tpcds_query("q24").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [c_last_name@0 ASC NULLS LAST, c_first_name@1 ASC NULLS LAST, s_store_name@2 ASC NULLS LAST] │ SortExec: expr=[c_last_name@0 ASC NULLS LAST, c_first_name@1 ASC NULLS LAST, s_store_name@2 ASC NULLS LAST], preserve_partitioning=[true] @@ -2789,8 +2568,7 @@ mod tests { │ CoalescePartitionsExec │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_state, ca_zip, ca_country], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_state, ca_zip, ca_country], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -2799,8 +2577,7 @@ mod tests { │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_current_addr_sk@1 as c_current_addr_sk, c_first_name@2 as c_first_name, c_last_name@3 as c_last_name, c_birth_country@4 as c_birth_country, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_addr_sk, c_first_name, c_last_name, c_birth_country], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_addr_sk, c_first_name, c_last_name, c_birth_country], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -2808,8 +2585,7 @@ mod tests { │ CoalescePartitionsExec │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_current_price, i_size, i_color, i_units, i_manager_id], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_current_price, i_size, i_color, i_units, i_manager_id], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -2826,18 +2602,15 @@ mod tests { │ ProjectionExec: expr=[s_store_sk@0 as s_store_sk, s_store_name@1 as s_store_name, s_state@2 as s_state, s_zip@3 as s_zip, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)] │ FilterExec: s_market_id@2 = 8, projection=[s_store_sk@0, s_store_name@1, s_state@3, s_zip@4] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name, s_market_id, s_state, s_zip], file_type=parquet, predicate=s_market_id@10 = 8, pruning_predicate=s_market_id_null_count@2 != row_count@3 AND s_market_id_min@0 <= 8 AND 8 <= s_market_id_max@1, required_guarantees=[s_market_id in (8)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name, s_market_id, s_state, s_zip], file_type=parquet, predicate=s_market_id@10 = 8, pruning_predicate=s_market_id_null_count@2 != row_count@3 AND s_market_id_min@0 <= 8 AND 8 <= s_market_id_max@1, required_guarantees=[s_market_id in (8)] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([sr_ticket_number@1, sr_item_sk@0], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], ...]}, projection=[sr_item_sk, sr_ticket_number], file_type=parquet + │ RepartitionExec: partitioning=Hash([sr_ticket_number@1, sr_item_sk@0], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-3.parquet:..]]}, projection=[sr_item_sk, sr_ticket_number], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([ss_ticket_number@3, ss_item_sk@0], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_item_sk, ss_customer_sk, ss_store_sk, ss_ticket_number, ss_net_paid], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([ss_ticket_number@3, ss_item_sk@0], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_item_sk, ss_customer_sk, ss_store_sk, ss_ticket_number, ss_net_paid], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 16 ── Tasks: t0:[p0..p2] t1:[p0..p2] │ RepartitionExec: partitioning=Hash([c_last_name@0, c_first_name@1, s_store_name@2], 3), input_partitions=3 @@ -2853,8 +2626,7 @@ mod tests { │ CoalescePartitionsExec │ [Stage 14] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_state, ca_zip, ca_country], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_state, ca_zip, ca_country], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 14 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -2863,8 +2635,7 @@ mod tests { │ [Stage 13] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_current_addr_sk@1 as c_current_addr_sk, c_first_name@2 as c_first_name, c_last_name@3 as c_last_name, c_birth_country@4 as c_birth_country, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_addr_sk, c_first_name, c_last_name, c_birth_country], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_addr_sk, c_first_name, c_last_name, c_birth_country], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 13 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -2873,8 +2644,7 @@ mod tests { │ [Stage 12] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ FilterExec: i_color@3 = peach │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_current_price, i_size, i_color, i_units, i_manager_id], file_type=parquet, predicate=i_color@17 = peach AND DynamicFilter [ empty ], pruning_predicate=i_color_null_count@2 != row_count@3 AND i_color_min@0 <= peach AND peach <= i_color_max@1, required_guarantees=[i_color in (peach)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_current_price, i_size, i_color, i_units, i_manager_id], file_type=parquet, predicate=i_color@17 = peach AND DynamicFilter [ empty ], pruning_predicate=i_color_null_count@2 != row_count@3 AND i_color_min@0 <= peach AND peach <= i_color_max@1, required_guarantees=[i_color in (peach)] └────────────────────────────────────────────────── ┌───── Stage 12 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -2891,18 +2661,15 @@ mod tests { │ ProjectionExec: expr=[s_store_sk@0 as s_store_sk, s_store_name@1 as s_store_name, s_state@2 as s_state, s_zip@3 as s_zip, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)] │ FilterExec: s_market_id@2 = 8, projection=[s_store_sk@0, s_store_name@1, s_state@3, s_zip@4] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name, s_market_id, s_state, s_zip], file_type=parquet, predicate=s_market_id@10 = 8, pruning_predicate=s_market_id_null_count@2 != row_count@3 AND s_market_id_min@0 <= 8 AND 8 <= s_market_id_max@1, required_guarantees=[s_market_id in (8)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name, s_market_id, s_state, s_zip], file_type=parquet, predicate=s_market_id@10 = 8, pruning_predicate=s_market_id_null_count@2 != row_count@3 AND s_market_id_min@0 <= 8 AND 8 <= s_market_id_max@1, required_guarantees=[s_market_id in (8)] └────────────────────────────────────────────────── ┌───── Stage 10 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([sr_ticket_number@1, sr_item_sk@0], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], ...]}, projection=[sr_item_sk, sr_ticket_number], file_type=parquet + │ RepartitionExec: partitioning=Hash([sr_ticket_number@1, sr_item_sk@0], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-3.parquet:..]]}, projection=[sr_item_sk, sr_ticket_number], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 11 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([ss_ticket_number@3, ss_item_sk@0], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_item_sk, ss_customer_sk, ss_store_sk, ss_ticket_number, ss_net_paid], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([ss_ticket_number@3, ss_item_sk@0], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_item_sk, ss_customer_sk, ss_store_sk, ss_ticket_number, ss_net_paid], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── "); Ok(()) @@ -2910,7 +2677,7 @@ mod tests { #[tokio::test] async fn test_tpcds_25() -> Result<()> { let display = test_tpcds_query("q25").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [i_item_id@0 ASC NULLS LAST, i_item_desc@1 ASC NULLS LAST, s_store_id@2 ASC NULLS LAST, s_store_name@3 ASC NULLS LAST], fetch=100 │ [Stage 11] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -2928,8 +2695,7 @@ mod tests { │ CoalescePartitionsExec │ [Stage 9] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id, i_item_desc], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id, i_item_desc], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -2943,13 +2709,11 @@ mod tests { │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(d3.d_date_sk AS Float64)] │ FilterExec: d_moy@2 >= 4 AND d_moy@2 <= 10 AND d_year@1 = 2001, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_moy@8 >= 4 AND d_moy@8 <= 10 AND d_year@6 = 2001 AND DynamicFilter [ empty ], pruning_predicate=d_moy_null_count@1 != row_count@2 AND d_moy_max@0 >= 4 AND d_moy_null_count@1 != row_count@2 AND d_moy_min@3 <= 10 AND d_year_null_count@6 != row_count@2 AND d_year_min@4 <= 2001 AND 2001 <= d_year_max@5, required_guarantees=[d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_moy@8 >= 4 AND d_moy@8 <= 10 AND d_year@6 = 2001 AND DynamicFilter [ empty ], pruning_predicate=d_moy_null_count@1 != row_count@2 AND d_moy_max@0 >= 4 AND d_moy_null_count@1 != row_count@2 AND d_moy_min@3 <= 10 AND d_year_null_count@6 != row_count@2 AND d_year_min@4 <= 2001 AND 2001 <= d_year_max@5, required_guarantees=[d_year in (2001)] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_id, s_store_name, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_id, s_store_name, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -2959,8 +2723,7 @@ mod tests { │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(d2.d_date_sk AS Float64)] │ FilterExec: d_moy@2 >= 4 AND d_moy@2 <= 10 AND d_year@1 = 2001, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_moy@8 >= 4 AND d_moy@8 <= 10 AND d_year@6 = 2001 AND DynamicFilter [ empty ], pruning_predicate=d_moy_null_count@1 != row_count@2 AND d_moy_max@0 >= 4 AND d_moy_null_count@1 != row_count@2 AND d_moy_min@3 <= 10 AND d_year_null_count@6 != row_count@2 AND d_year_min@4 <= 2001 AND 2001 <= d_year_max@5, required_guarantees=[d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_moy@8 >= 4 AND d_moy@8 <= 10 AND d_year@6 = 2001 AND DynamicFilter [ empty ], pruning_predicate=d_moy_null_count@1 != row_count@2 AND d_moy_max@0 >= 4 AND d_moy_null_count@1 != row_count@2 AND d_moy_min@3 <= 10 AND d_year_null_count@6 != row_count@2 AND d_year_min@4 <= 2001 AND 2001 <= d_year_max@5, required_guarantees=[d_year in (2001)] └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -2977,13 +2740,11 @@ mod tests { │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(d1.d_date_sk AS Float64)] │ FilterExec: d_moy@2 = 4 AND d_year@1 = 2001, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_moy@8 = 4 AND d_year@6 = 2001, pruning_predicate=d_moy_null_count@2 != row_count@3 AND d_moy_min@0 <= 4 AND 4 <= d_moy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 2001 AND 2001 <= d_year_max@5, required_guarantees=[d_moy in (4), d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_moy@8 = 4 AND d_year@6 = 2001, pruning_predicate=d_moy_null_count@2 != row_count@3 AND d_moy_min@0 <= 4 AND 4 <= d_moy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 2001 AND 2001 <= d_year_max@5, required_guarantees=[d_moy in (4), d_year in (2001)] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([cs_bill_customer_sk@1, cs_item_sk@2], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_bill_customer_sk, cs_item_sk, cs_net_profit], file_type=parquet + │ RepartitionExec: partitioning=Hash([cs_bill_customer_sk@1, cs_item_sk@2], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_bill_customer_sk, cs_item_sk, cs_net_profit], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] │ RepartitionExec: partitioning=Hash([sr_customer_sk@6, sr_item_sk@5], 9), input_partitions=3 @@ -2993,14 +2754,12 @@ mod tests { │ [Stage 5] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([sr_customer_sk@2, sr_item_sk@1, sr_ticket_number@3], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], ...]}, projection=[sr_returned_date_sk, sr_item_sk, sr_customer_sk, sr_ticket_number, sr_net_loss], file_type=parquet, predicate=DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([sr_customer_sk@2, sr_item_sk@1, sr_ticket_number@3], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-3.parquet:..]]}, projection=[sr_returned_date_sk, sr_item_sk, sr_customer_sk, sr_ticket_number, sr_net_loss], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([ss_customer_sk@2, ss_item_sk@1, ss_ticket_number@4], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_customer_sk, ss_store_sk, ss_ticket_number, ss_net_profit], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([ss_customer_sk@2, ss_item_sk@1, ss_ticket_number@4], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_customer_sk, ss_store_sk, ss_ticket_number, ss_net_profit], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── "); Ok(()) @@ -3008,7 +2767,7 @@ mod tests { #[tokio::test] async fn test_tpcds_26() -> Result<()> { let display = test_tpcds_query("q26").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [i_item_id@0 ASC NULLS LAST], fetch=100 │ [Stage 7] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -3029,16 +2788,14 @@ mod tests { │ CoalescePartitionsExec │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[p_promo_sk@0 as p_promo_sk, CAST(p_promo_sk@0 AS Float64) as CAST(promotion.p_promo_sk AS Float64)] │ FilterExec: p_channel_email@1 = N OR p_channel_event@2 = N, projection=[p_promo_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/promotion/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-3.parquet]]}, projection=[p_promo_sk, p_channel_email, p_channel_event], file_type=parquet, predicate=p_channel_email@9 = N OR p_channel_event@14 = N, pruning_predicate=p_channel_email_null_count@2 != row_count@3 AND p_channel_email_min@0 <= N AND N <= p_channel_email_max@1 OR p_channel_event_null_count@6 != row_count@3 AND p_channel_event_min@4 <= N AND N <= p_channel_event_max@5, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/promotion/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/promotion/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/promotion/part-3.parquet]]}, projection=[p_promo_sk, p_channel_email, p_channel_event], file_type=parquet, predicate=p_channel_email@9 = N OR p_channel_event@14 = N, pruning_predicate=p_channel_email_null_count@2 != row_count@3 AND p_channel_email_min@0 <= N AND N <= p_channel_email_max@1 OR p_channel_event_null_count@6 != row_count@3 AND p_channel_event_min@4 <= N AND N <= p_channel_event_max@5, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -3054,20 +2811,17 @@ mod tests { │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2000, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2000, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1, required_guarantees=[d_year in (2000)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2000, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1, required_guarantees=[d_year in (2000)] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([CAST(customer_demographics.cd_demo_sk AS Float64)@1], 9), input_partitions=2 + │ RepartitionExec: partitioning=Hash([CAST(customer_demographics.cd_demo_sk AS Float64)@1], 9), input_partitions=3 │ ProjectionExec: expr=[cd_demo_sk@0 as cd_demo_sk, CAST(cd_demo_sk@0 AS Float64) as CAST(customer_demographics.cd_demo_sk AS Float64)] │ FilterExec: cd_gender@1 = M AND cd_marital_status@2 = S AND cd_education_status@3 = College, projection=[cd_demo_sk@0] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], ...]}, projection=[cd_demo_sk, cd_gender, cd_marital_status, cd_education_status], file_type=parquet, predicate=cd_gender@1 = M AND cd_marital_status@2 = S AND cd_education_status@3 = College, pruning_predicate=cd_gender_null_count@2 != row_count@3 AND cd_gender_min@0 <= M AND M <= cd_gender_max@1 AND cd_marital_status_null_count@6 != row_count@3 AND cd_marital_status_min@4 <= S AND S <= cd_marital_status_max@5 AND cd_education_status_null_count@9 != row_count@3 AND cd_education_status_min@7 <= College AND College <= cd_education_status_max@8, required_guarantees=[cd_education_status in (College), cd_gender in (M), cd_marital_status in (S)] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-3.parquet:..]]}, projection=[cd_demo_sk, cd_gender, cd_marital_status, cd_education_status], file_type=parquet, predicate=cd_gender@1 = M AND cd_marital_status@2 = S AND cd_education_status@3 = College, pruning_predicate=cd_gender_null_count@2 != row_count@3 AND cd_gender_min@0 <= M AND M <= cd_gender_max@1 AND cd_marital_status_null_count@6 != row_count@3 AND cd_marital_status_min@4 <= S AND S <= cd_marital_status_max@5 AND cd_education_status_null_count@9 != row_count@3 AND cd_education_status_min@7 <= College AND College <= cd_education_status_max@8, required_guarantees=[cd_education_status in (College), cd_gender in (M), cd_marital_status in (S)] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([cs_bill_cdemo_sk@1], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_bill_cdemo_sk, cs_item_sk, cs_promo_sk, cs_quantity, cs_list_price, cs_sales_price, cs_coupon_amt], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([cs_bill_cdemo_sk@1], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_bill_cdemo_sk, cs_item_sk, cs_promo_sk, cs_quantity, cs_list_price, cs_sales_price, cs_coupon_amt], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── "); Ok(()) @@ -3102,8 +2856,7 @@ mod tests { │ CoalescePartitionsExec │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -3123,28 +2876,24 @@ mod tests { │ ProjectionExec: expr=[s_store_sk@0 as s_store_sk, s_state@1 as s_state, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)] │ FilterExec: s_state@1 = TN │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_state], file_type=parquet, predicate=s_state@24 = TN, pruning_predicate=s_state_null_count@2 != row_count@3 AND s_state_min@0 <= TN AND TN <= s_state_max@1, required_guarantees=[s_state in (TN)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_state], file_type=parquet, predicate=s_state@24 = TN, pruning_predicate=s_state_null_count@2 != row_count@3 AND s_state_min@0 <= TN AND TN <= s_state_max@1, required_guarantees=[s_state in (TN)] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2002, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2002, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1, required_guarantees=[d_year in (2002)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2002, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1, required_guarantees=[d_year in (2002)] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([CAST(customer_demographics.cd_demo_sk AS Float64)@1], 9), input_partitions=2 + │ RepartitionExec: partitioning=Hash([CAST(customer_demographics.cd_demo_sk AS Float64)@1], 9), input_partitions=3 │ ProjectionExec: expr=[cd_demo_sk@0 as cd_demo_sk, CAST(cd_demo_sk@0 AS Float64) as CAST(customer_demographics.cd_demo_sk AS Float64)] │ FilterExec: cd_gender@1 = M AND cd_marital_status@2 = S AND cd_education_status@3 = College, projection=[cd_demo_sk@0] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], ...]}, projection=[cd_demo_sk, cd_gender, cd_marital_status, cd_education_status], file_type=parquet, predicate=cd_gender@1 = M AND cd_marital_status@2 = S AND cd_education_status@3 = College, pruning_predicate=cd_gender_null_count@2 != row_count@3 AND cd_gender_min@0 <= M AND M <= cd_gender_max@1 AND cd_marital_status_null_count@6 != row_count@3 AND cd_marital_status_min@4 <= S AND S <= cd_marital_status_max@5 AND cd_education_status_null_count@9 != row_count@3 AND cd_education_status_min@7 <= College AND College <= cd_education_status_max@8, required_guarantees=[cd_education_status in (College), cd_gender in (M), cd_marital_status in (S)] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-3.parquet:..]]}, projection=[cd_demo_sk, cd_gender, cd_marital_status, cd_education_status], file_type=parquet, predicate=cd_gender@1 = M AND cd_marital_status@2 = S AND cd_education_status@3 = College, pruning_predicate=cd_gender_null_count@2 != row_count@3 AND cd_gender_min@0 <= M AND M <= cd_gender_max@1 AND cd_marital_status_null_count@6 != row_count@3 AND cd_marital_status_min@4 <= S AND S <= cd_marital_status_max@5 AND cd_education_status_null_count@9 != row_count@3 AND cd_education_status_min@7 <= College AND College <= cd_education_status_max@8, required_guarantees=[cd_education_status in (College), cd_gender in (M), cd_marital_status in (S)] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([ss_cdemo_sk@2], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_cdemo_sk, ss_store_sk, ss_quantity, ss_list_price, ss_sales_price, ss_coupon_amt], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([ss_cdemo_sk@2], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_cdemo_sk, ss_store_sk, ss_quantity, ss_list_price, ss_sales_price, ss_coupon_amt], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 12 ── Tasks: t0:[p0..p2] t1:[p0..p2] t2:[p0..p2] │ RepartitionExec: partitioning=Hash([i_item_id@0], 3), input_partitions=3 @@ -3154,8 +2903,7 @@ mod tests { │ CoalescePartitionsExec │ [Stage 11] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 11 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -3174,28 +2922,24 @@ mod tests { │ ProjectionExec: expr=[s_store_sk@0 as s_store_sk, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)] │ FilterExec: s_state@1 = TN, projection=[s_store_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_state], file_type=parquet, predicate=s_state@24 = TN, pruning_predicate=s_state_null_count@2 != row_count@3 AND s_state_min@0 <= TN AND TN <= s_state_max@1, required_guarantees=[s_state in (TN)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_state], file_type=parquet, predicate=s_state@24 = TN, pruning_predicate=s_state_null_count@2 != row_count@3 AND s_state_min@0 <= TN AND TN <= s_state_max@1, required_guarantees=[s_state in (TN)] └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2002, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2002, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1, required_guarantees=[d_year in (2002)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2002, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1, required_guarantees=[d_year in (2002)] └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([CAST(customer_demographics.cd_demo_sk AS Float64)@1], 9), input_partitions=2 + │ RepartitionExec: partitioning=Hash([CAST(customer_demographics.cd_demo_sk AS Float64)@1], 9), input_partitions=3 │ ProjectionExec: expr=[cd_demo_sk@0 as cd_demo_sk, CAST(cd_demo_sk@0 AS Float64) as CAST(customer_demographics.cd_demo_sk AS Float64)] │ FilterExec: cd_gender@1 = M AND cd_marital_status@2 = S AND cd_education_status@3 = College, projection=[cd_demo_sk@0] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], ...]}, projection=[cd_demo_sk, cd_gender, cd_marital_status, cd_education_status], file_type=parquet, predicate=cd_gender@1 = M AND cd_marital_status@2 = S AND cd_education_status@3 = College, pruning_predicate=cd_gender_null_count@2 != row_count@3 AND cd_gender_min@0 <= M AND M <= cd_gender_max@1 AND cd_marital_status_null_count@6 != row_count@3 AND cd_marital_status_min@4 <= S AND S <= cd_marital_status_max@5 AND cd_education_status_null_count@9 != row_count@3 AND cd_education_status_min@7 <= College AND College <= cd_education_status_max@8, required_guarantees=[cd_education_status in (College), cd_gender in (M), cd_marital_status in (S)] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-3.parquet:..]]}, projection=[cd_demo_sk, cd_gender, cd_marital_status, cd_education_status], file_type=parquet, predicate=cd_gender@1 = M AND cd_marital_status@2 = S AND cd_education_status@3 = College, pruning_predicate=cd_gender_null_count@2 != row_count@3 AND cd_gender_min@0 <= M AND M <= cd_gender_max@1 AND cd_marital_status_null_count@6 != row_count@3 AND cd_marital_status_min@4 <= S AND S <= cd_marital_status_max@5 AND cd_education_status_null_count@9 != row_count@3 AND cd_education_status_min@7 <= College AND College <= cd_education_status_max@8, required_guarantees=[cd_education_status in (College), cd_gender in (M), cd_marital_status in (S)] └────────────────────────────────────────────────── ┌───── Stage 10 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([ss_cdemo_sk@2], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_cdemo_sk, ss_store_sk, ss_quantity, ss_list_price, ss_sales_price, ss_coupon_amt], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([ss_cdemo_sk@2], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_cdemo_sk, ss_store_sk, ss_quantity, ss_list_price, ss_sales_price, ss_coupon_amt], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 18 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[avg(results.agg1), avg(results.agg2), avg(results.agg3), avg(results.agg4)] @@ -3204,8 +2948,7 @@ mod tests { │ CoalescePartitionsExec │ [Stage 17] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 17 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -3224,28 +2967,24 @@ mod tests { │ ProjectionExec: expr=[s_store_sk@0 as s_store_sk, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)] │ FilterExec: s_state@1 = TN, projection=[s_store_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_state], file_type=parquet, predicate=s_state@24 = TN AND DynamicFilter [ empty ], pruning_predicate=s_state_null_count@2 != row_count@3 AND s_state_min@0 <= TN AND TN <= s_state_max@1, required_guarantees=[s_state in (TN)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_state], file_type=parquet, predicate=s_state@24 = TN AND DynamicFilter [ empty ], pruning_predicate=s_state_null_count@2 != row_count@3 AND s_state_min@0 <= TN AND TN <= s_state_max@1, required_guarantees=[s_state in (TN)] └────────────────────────────────────────────────── ┌───── Stage 14 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2002, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2002 AND DynamicFilter [ empty ], pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1, required_guarantees=[d_year in (2002)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2002 AND DynamicFilter [ empty ], pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1, required_guarantees=[d_year in (2002)] └────────────────────────────────────────────────── ┌───── Stage 15 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([CAST(customer_demographics.cd_demo_sk AS Float64)@1], 9), input_partitions=2 + │ RepartitionExec: partitioning=Hash([CAST(customer_demographics.cd_demo_sk AS Float64)@1], 9), input_partitions=3 │ ProjectionExec: expr=[cd_demo_sk@0 as cd_demo_sk, CAST(cd_demo_sk@0 AS Float64) as CAST(customer_demographics.cd_demo_sk AS Float64)] │ FilterExec: cd_gender@1 = M AND cd_marital_status@2 = S AND cd_education_status@3 = College, projection=[cd_demo_sk@0] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], ...]}, projection=[cd_demo_sk, cd_gender, cd_marital_status, cd_education_status], file_type=parquet, predicate=cd_gender@1 = M AND cd_marital_status@2 = S AND cd_education_status@3 = College AND DynamicFilter [ empty ], pruning_predicate=cd_gender_null_count@2 != row_count@3 AND cd_gender_min@0 <= M AND M <= cd_gender_max@1 AND cd_marital_status_null_count@6 != row_count@3 AND cd_marital_status_min@4 <= S AND S <= cd_marital_status_max@5 AND cd_education_status_null_count@9 != row_count@3 AND cd_education_status_min@7 <= College AND College <= cd_education_status_max@8, required_guarantees=[cd_education_status in (College), cd_gender in (M), cd_marital_status in (S)] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-3.parquet:..]]}, projection=[cd_demo_sk, cd_gender, cd_marital_status, cd_education_status], file_type=parquet, predicate=cd_gender@1 = M AND cd_marital_status@2 = S AND cd_education_status@3 = College AND DynamicFilter [ empty ], pruning_predicate=cd_gender_null_count@2 != row_count@3 AND cd_gender_min@0 <= M AND M <= cd_gender_max@1 AND cd_marital_status_null_count@6 != row_count@3 AND cd_marital_status_min@4 <= S AND S <= cd_marital_status_max@5 AND cd_education_status_null_count@9 != row_count@3 AND cd_education_status_min@7 <= College AND College <= cd_education_status_max@8, required_guarantees=[cd_education_status in (College), cd_gender in (M), cd_marital_status in (S)] └────────────────────────────────────────────────── ┌───── Stage 16 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([ss_cdemo_sk@2], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_cdemo_sk, ss_store_sk, ss_quantity, ss_list_price, ss_sales_price, ss_coupon_amt], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([ss_cdemo_sk@2], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_cdemo_sk, ss_store_sk, ss_quantity, ss_list_price, ss_sales_price, ss_coupon_amt], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── "); Ok(()) @@ -3253,7 +2992,7 @@ mod tests { #[tokio::test] async fn test_tpcds_28() -> Result<()> { let display = test_tpcds_query("q28").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ ProjectionExec: expr=[b1_lp@3 as b1_lp, b1_cnt@4 as b1_cnt, b1_cntd@5 as b1_cntd, b2_lp@6 as b2_lp, b2_cnt@7 as b2_cnt, b2_cntd@8 as b2_cntd, b3_lp@9 as b3_lp, b3_cnt@10 as b3_cnt, b3_cntd@11 as b3_cntd, b4_lp@12 as b4_lp, b4_cnt@13 as b4_cnt, b4_cntd@14 as b4_cntd, b5_lp@15 as b5_lp, b5_cnt@16 as b5_cnt, b5_cntd@17 as b5_cntd, b6_lp@0 as b6_lp, b6_cnt@1 as b6_cnt, b6_cntd@2 as b6_cntd] │ GlobalLimitExec: skip=0, fetch=100 @@ -3262,7 +3001,7 @@ mod tests { │ GlobalLimitExec: skip=0, fetch=100 │ AggregateExec: mode=Final, gby=[], aggr=[avg(store_sales.ss_list_price), count(store_sales.ss_list_price), count(DISTINCT store_sales.ss_list_price)] │ CoalescePartitionsExec - │ [Stage 1] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 1] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ ProjectionExec: expr=[b1_lp@3 as b1_lp, b1_cnt@4 as b1_cnt, b1_cntd@5 as b1_cntd, b2_lp@6 as b2_lp, b2_cnt@7 as b2_cnt, b2_cntd@8 as b2_cntd, b3_lp@9 as b3_lp, b3_cnt@10 as b3_cnt, b3_cntd@11 as b3_cntd, b4_lp@12 as b4_lp, b4_cnt@13 as b4_cnt, b4_cntd@14 as b4_cntd, b5_lp@0 as b5_lp, b5_cnt@1 as b5_cnt, b5_cntd@2 as b5_cntd] │ GlobalLimitExec: skip=0, fetch=100 │ CrossJoinExec @@ -3270,7 +3009,7 @@ mod tests { │ GlobalLimitExec: skip=0, fetch=100 │ AggregateExec: mode=Final, gby=[], aggr=[avg(store_sales.ss_list_price), count(store_sales.ss_list_price), count(DISTINCT store_sales.ss_list_price)] │ CoalescePartitionsExec - │ [Stage 2] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 2] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ ProjectionExec: expr=[b1_lp@3 as b1_lp, b1_cnt@4 as b1_cnt, b1_cntd@5 as b1_cntd, b2_lp@6 as b2_lp, b2_cnt@7 as b2_cnt, b2_cntd@8 as b2_cntd, b3_lp@9 as b3_lp, b3_cnt@10 as b3_cnt, b3_cntd@11 as b3_cntd, b4_lp@0 as b4_lp, b4_cnt@1 as b4_cnt, b4_cntd@2 as b4_cntd] │ GlobalLimitExec: skip=0, fetch=100 │ CrossJoinExec @@ -3278,7 +3017,7 @@ mod tests { │ GlobalLimitExec: skip=0, fetch=100 │ AggregateExec: mode=Final, gby=[], aggr=[avg(store_sales.ss_list_price), count(store_sales.ss_list_price), count(DISTINCT store_sales.ss_list_price)] │ CoalescePartitionsExec - │ [Stage 3] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 3] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ ProjectionExec: expr=[b1_lp@3 as b1_lp, b1_cnt@4 as b1_cnt, b1_cntd@5 as b1_cntd, b2_lp@6 as b2_lp, b2_cnt@7 as b2_cnt, b2_cntd@8 as b2_cntd, b3_lp@0 as b3_lp, b3_cnt@1 as b3_cnt, b3_cntd@2 as b3_cntd] │ GlobalLimitExec: skip=0, fetch=100 │ CrossJoinExec @@ -3286,55 +3025,49 @@ mod tests { │ GlobalLimitExec: skip=0, fetch=100 │ AggregateExec: mode=Final, gby=[], aggr=[avg(store_sales.ss_list_price), count(store_sales.ss_list_price), count(DISTINCT store_sales.ss_list_price)] │ CoalescePartitionsExec - │ [Stage 4] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 4] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ GlobalLimitExec: skip=0, fetch=100 │ CrossJoinExec │ ProjectionExec: expr=[avg(store_sales.ss_list_price)@0 as b1_lp, count(store_sales.ss_list_price)@1 as b1_cnt, count(DISTINCT store_sales.ss_list_price)@2 as b1_cntd] │ GlobalLimitExec: skip=0, fetch=100 │ AggregateExec: mode=Final, gby=[], aggr=[avg(store_sales.ss_list_price), count(store_sales.ss_list_price), count(DISTINCT store_sales.ss_list_price)] │ CoalescePartitionsExec - │ [Stage 5] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 5] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ ProjectionExec: expr=[avg(store_sales.ss_list_price)@0 as b2_lp, count(store_sales.ss_list_price)@1 as b2_cnt, count(DISTINCT store_sales.ss_list_price)@2 as b2_cntd] │ GlobalLimitExec: skip=0, fetch=100 │ AggregateExec: mode=Final, gby=[], aggr=[avg(store_sales.ss_list_price), count(store_sales.ss_list_price), count(DISTINCT store_sales.ss_list_price)] │ CoalescePartitionsExec - │ [Stage 6] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 6] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 1 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[avg(store_sales.ss_list_price), count(store_sales.ss_list_price), count(DISTINCT store_sales.ss_list_price)] │ FilterExec: ss_quantity@0 >= 26 AND ss_quantity@0 <= 30 AND (ss_list_price@2 >= Some(15400),5,2 AND ss_list_price@2 <= Some(16400),5,2 OR ss_coupon_amt@3 >= Some(732600),7,2 AND ss_coupon_amt@3 <= Some(832600),7,2 OR ss_wholesale_cost@1 >= Some(700),5,2 AND ss_wholesale_cost@1 <= Some(2700),5,2), projection=[ss_list_price@2] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_quantity, ss_wholesale_cost, ss_list_price, ss_coupon_amt], file_type=parquet, predicate=ss_quantity@10 >= 26 AND ss_quantity@10 <= 30 AND (ss_list_price@12 >= Some(15400),5,2 AND ss_list_price@12 <= Some(16400),5,2 OR ss_coupon_amt@19 >= Some(732600),7,2 AND ss_coupon_amt@19 <= Some(832600),7,2 OR ss_wholesale_cost@11 >= Some(700),5,2 AND ss_wholesale_cost@11 <= Some(2700),5,2), pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 26 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 30 AND (ss_list_price_null_count@5 != row_count@2 AND ss_list_price_max@4 >= Some(15400),5,2 AND ss_list_price_null_count@5 != row_count@2 AND ss_list_price_min@6 <= Some(16400),5,2 OR ss_coupon_amt_null_count@8 != row_count@2 AND ss_coupon_amt_max@7 >= Some(732600),7,2 AND ss_coupon_amt_null_count@8 != row_count@2 AND ss_coupon_amt_min@9 <= Some(832600),7,2 OR ss_wholesale_cost_null_count@11 != row_count@2 AND ss_wholesale_cost_max@10 >= Some(700),5,2 AND ss_wholesale_cost_null_count@11 != row_count@2 AND ss_wholesale_cost_min@12 <= Some(2700),5,2), required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_quantity, ss_wholesale_cost, ss_list_price, ss_coupon_amt], file_type=parquet, predicate=ss_quantity@10 >= 26 AND ss_quantity@10 <= 30 AND (ss_list_price@12 >= Some(15400),5,2 AND ss_list_price@12 <= Some(16400),5,2 OR ss_coupon_amt@19 >= Some(732600),7,2 AND ss_coupon_amt@19 <= Some(832600),7,2 OR ss_wholesale_cost@11 >= Some(700),5,2 AND ss_wholesale_cost@11 <= Some(2700),5,2), pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 26 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 30 AND (ss_list_price_null_count@5 != row_count@2 AND ss_list_price_max@4 >= Some(15400),5,2 AND ss_list_price_null_count@5 != row_count@2 AND ss_list_price_min@6 <= Some(16400),5,2 OR ss_coupon_amt_null_count@8 != row_count@2 AND ss_coupon_amt_max@7 >= Some(732600),7,2 AND ss_coupon_amt_null_count@8 != row_count@2 AND ss_coupon_amt_min@9 <= Some(832600),7,2 OR ss_wholesale_cost_null_count@11 != row_count@2 AND ss_wholesale_cost_max@10 >= Some(700),5,2 AND ss_wholesale_cost_null_count@11 != row_count@2 AND ss_wholesale_cost_min@12 <= Some(2700),5,2), required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 2 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[avg(store_sales.ss_list_price), count(store_sales.ss_list_price), count(DISTINCT store_sales.ss_list_price)] │ FilterExec: ss_quantity@0 >= 21 AND ss_quantity@0 <= 25 AND (ss_list_price@2 >= Some(12200),5,2 AND ss_list_price@2 <= Some(13200),5,2 OR ss_coupon_amt@3 >= Some(83600),7,2 AND ss_coupon_amt@3 <= Some(183600),7,2 OR ss_wholesale_cost@1 >= Some(1700),5,2 AND ss_wholesale_cost@1 <= Some(3700),5,2), projection=[ss_list_price@2] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_quantity, ss_wholesale_cost, ss_list_price, ss_coupon_amt], file_type=parquet, predicate=ss_quantity@10 >= 21 AND ss_quantity@10 <= 25 AND (ss_list_price@12 >= Some(12200),5,2 AND ss_list_price@12 <= Some(13200),5,2 OR ss_coupon_amt@19 >= Some(83600),7,2 AND ss_coupon_amt@19 <= Some(183600),7,2 OR ss_wholesale_cost@11 >= Some(1700),5,2 AND ss_wholesale_cost@11 <= Some(3700),5,2), pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 21 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 25 AND (ss_list_price_null_count@5 != row_count@2 AND ss_list_price_max@4 >= Some(12200),5,2 AND ss_list_price_null_count@5 != row_count@2 AND ss_list_price_min@6 <= Some(13200),5,2 OR ss_coupon_amt_null_count@8 != row_count@2 AND ss_coupon_amt_max@7 >= Some(83600),7,2 AND ss_coupon_amt_null_count@8 != row_count@2 AND ss_coupon_amt_min@9 <= Some(183600),7,2 OR ss_wholesale_cost_null_count@11 != row_count@2 AND ss_wholesale_cost_max@10 >= Some(1700),5,2 AND ss_wholesale_cost_null_count@11 != row_count@2 AND ss_wholesale_cost_min@12 <= Some(3700),5,2), required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_quantity, ss_wholesale_cost, ss_list_price, ss_coupon_amt], file_type=parquet, predicate=ss_quantity@10 >= 21 AND ss_quantity@10 <= 25 AND (ss_list_price@12 >= Some(12200),5,2 AND ss_list_price@12 <= Some(13200),5,2 OR ss_coupon_amt@19 >= Some(83600),7,2 AND ss_coupon_amt@19 <= Some(183600),7,2 OR ss_wholesale_cost@11 >= Some(1700),5,2 AND ss_wholesale_cost@11 <= Some(3700),5,2), pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 21 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 25 AND (ss_list_price_null_count@5 != row_count@2 AND ss_list_price_max@4 >= Some(12200),5,2 AND ss_list_price_null_count@5 != row_count@2 AND ss_list_price_min@6 <= Some(13200),5,2 OR ss_coupon_amt_null_count@8 != row_count@2 AND ss_coupon_amt_max@7 >= Some(83600),7,2 AND ss_coupon_amt_null_count@8 != row_count@2 AND ss_coupon_amt_min@9 <= Some(183600),7,2 OR ss_wholesale_cost_null_count@11 != row_count@2 AND ss_wholesale_cost_max@10 >= Some(1700),5,2 AND ss_wholesale_cost_null_count@11 != row_count@2 AND ss_wholesale_cost_min@12 <= Some(3700),5,2), required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 3 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 3 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[avg(store_sales.ss_list_price), count(store_sales.ss_list_price), count(DISTINCT store_sales.ss_list_price)] │ FilterExec: ss_quantity@0 >= 16 AND ss_quantity@0 <= 20 AND (ss_list_price@2 >= Some(13500),5,2 AND ss_list_price@2 <= Some(14500),5,2 OR ss_coupon_amt@3 >= Some(607100),7,2 AND ss_coupon_amt@3 <= Some(707100),7,2 OR ss_wholesale_cost@1 >= Some(3800),5,2 AND ss_wholesale_cost@1 <= Some(5800),5,2), projection=[ss_list_price@2] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_quantity, ss_wholesale_cost, ss_list_price, ss_coupon_amt], file_type=parquet, predicate=ss_quantity@10 >= 16 AND ss_quantity@10 <= 20 AND (ss_list_price@12 >= Some(13500),5,2 AND ss_list_price@12 <= Some(14500),5,2 OR ss_coupon_amt@19 >= Some(607100),7,2 AND ss_coupon_amt@19 <= Some(707100),7,2 OR ss_wholesale_cost@11 >= Some(3800),5,2 AND ss_wholesale_cost@11 <= Some(5800),5,2), pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 16 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 20 AND (ss_list_price_null_count@5 != row_count@2 AND ss_list_price_max@4 >= Some(13500),5,2 AND ss_list_price_null_count@5 != row_count@2 AND ss_list_price_min@6 <= Some(14500),5,2 OR ss_coupon_amt_null_count@8 != row_count@2 AND ss_coupon_amt_max@7 >= Some(607100),7,2 AND ss_coupon_amt_null_count@8 != row_count@2 AND ss_coupon_amt_min@9 <= Some(707100),7,2 OR ss_wholesale_cost_null_count@11 != row_count@2 AND ss_wholesale_cost_max@10 >= Some(3800),5,2 AND ss_wholesale_cost_null_count@11 != row_count@2 AND ss_wholesale_cost_min@12 <= Some(5800),5,2), required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_quantity, ss_wholesale_cost, ss_list_price, ss_coupon_amt], file_type=parquet, predicate=ss_quantity@10 >= 16 AND ss_quantity@10 <= 20 AND (ss_list_price@12 >= Some(13500),5,2 AND ss_list_price@12 <= Some(14500),5,2 OR ss_coupon_amt@19 >= Some(607100),7,2 AND ss_coupon_amt@19 <= Some(707100),7,2 OR ss_wholesale_cost@11 >= Some(3800),5,2 AND ss_wholesale_cost@11 <= Some(5800),5,2), pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 16 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 20 AND (ss_list_price_null_count@5 != row_count@2 AND ss_list_price_max@4 >= Some(13500),5,2 AND ss_list_price_null_count@5 != row_count@2 AND ss_list_price_min@6 <= Some(14500),5,2 OR ss_coupon_amt_null_count@8 != row_count@2 AND ss_coupon_amt_max@7 >= Some(607100),7,2 AND ss_coupon_amt_null_count@8 != row_count@2 AND ss_coupon_amt_min@9 <= Some(707100),7,2 OR ss_wholesale_cost_null_count@11 != row_count@2 AND ss_wholesale_cost_max@10 >= Some(3800),5,2 AND ss_wholesale_cost_null_count@11 != row_count@2 AND ss_wholesale_cost_min@12 <= Some(5800),5,2), required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 4 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 4 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[avg(store_sales.ss_list_price), count(store_sales.ss_list_price), count(DISTINCT store_sales.ss_list_price)] │ FilterExec: ss_quantity@0 >= 11 AND ss_quantity@0 <= 15 AND (ss_list_price@2 >= Some(14200),5,2 AND ss_list_price@2 <= Some(15200),5,2 OR ss_coupon_amt@3 >= Some(1221400),7,2 AND ss_coupon_amt@3 <= Some(1321400),7,2 OR ss_wholesale_cost@1 >= Some(7900),5,2 AND ss_wholesale_cost@1 <= Some(9900),5,2), projection=[ss_list_price@2] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_quantity, ss_wholesale_cost, ss_list_price, ss_coupon_amt], file_type=parquet, predicate=ss_quantity@10 >= 11 AND ss_quantity@10 <= 15 AND (ss_list_price@12 >= Some(14200),5,2 AND ss_list_price@12 <= Some(15200),5,2 OR ss_coupon_amt@19 >= Some(1221400),7,2 AND ss_coupon_amt@19 <= Some(1321400),7,2 OR ss_wholesale_cost@11 >= Some(7900),5,2 AND ss_wholesale_cost@11 <= Some(9900),5,2), pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 11 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 15 AND (ss_list_price_null_count@5 != row_count@2 AND ss_list_price_max@4 >= Some(14200),5,2 AND ss_list_price_null_count@5 != row_count@2 AND ss_list_price_min@6 <= Some(15200),5,2 OR ss_coupon_amt_null_count@8 != row_count@2 AND ss_coupon_amt_max@7 >= Some(1221400),7,2 AND ss_coupon_amt_null_count@8 != row_count@2 AND ss_coupon_amt_min@9 <= Some(1321400),7,2 OR ss_wholesale_cost_null_count@11 != row_count@2 AND ss_wholesale_cost_max@10 >= Some(7900),5,2 AND ss_wholesale_cost_null_count@11 != row_count@2 AND ss_wholesale_cost_min@12 <= Some(9900),5,2), required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_quantity, ss_wholesale_cost, ss_list_price, ss_coupon_amt], file_type=parquet, predicate=ss_quantity@10 >= 11 AND ss_quantity@10 <= 15 AND (ss_list_price@12 >= Some(14200),5,2 AND ss_list_price@12 <= Some(15200),5,2 OR ss_coupon_amt@19 >= Some(1221400),7,2 AND ss_coupon_amt@19 <= Some(1321400),7,2 OR ss_wholesale_cost@11 >= Some(7900),5,2 AND ss_wholesale_cost@11 <= Some(9900),5,2), pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 11 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 15 AND (ss_list_price_null_count@5 != row_count@2 AND ss_list_price_max@4 >= Some(14200),5,2 AND ss_list_price_null_count@5 != row_count@2 AND ss_list_price_min@6 <= Some(15200),5,2 OR ss_coupon_amt_null_count@8 != row_count@2 AND ss_coupon_amt_max@7 >= Some(1221400),7,2 AND ss_coupon_amt_null_count@8 != row_count@2 AND ss_coupon_amt_min@9 <= Some(1321400),7,2 OR ss_wholesale_cost_null_count@11 != row_count@2 AND ss_wholesale_cost_max@10 >= Some(7900),5,2 AND ss_wholesale_cost_null_count@11 != row_count@2 AND ss_wholesale_cost_min@12 <= Some(9900),5,2), required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 5 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 5 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[avg(store_sales.ss_list_price), count(store_sales.ss_list_price), count(DISTINCT store_sales.ss_list_price)] │ FilterExec: ss_quantity@0 >= 0 AND ss_quantity@0 <= 5 AND (ss_list_price@2 >= Some(800),5,2 AND ss_list_price@2 <= Some(1800),5,2 OR ss_coupon_amt@3 >= Some(45900),7,2 AND ss_coupon_amt@3 <= Some(145900),7,2 OR ss_wholesale_cost@1 >= Some(5700),5,2 AND ss_wholesale_cost@1 <= Some(7700),5,2), projection=[ss_list_price@2] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_quantity, ss_wholesale_cost, ss_list_price, ss_coupon_amt], file_type=parquet, predicate=ss_quantity@10 >= 0 AND ss_quantity@10 <= 5 AND (ss_list_price@12 >= Some(800),5,2 AND ss_list_price@12 <= Some(1800),5,2 OR ss_coupon_amt@19 >= Some(45900),7,2 AND ss_coupon_amt@19 <= Some(145900),7,2 OR ss_wholesale_cost@11 >= Some(5700),5,2 AND ss_wholesale_cost@11 <= Some(7700),5,2), pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 0 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 5 AND (ss_list_price_null_count@5 != row_count@2 AND ss_list_price_max@4 >= Some(800),5,2 AND ss_list_price_null_count@5 != row_count@2 AND ss_list_price_min@6 <= Some(1800),5,2 OR ss_coupon_amt_null_count@8 != row_count@2 AND ss_coupon_amt_max@7 >= Some(45900),7,2 AND ss_coupon_amt_null_count@8 != row_count@2 AND ss_coupon_amt_min@9 <= Some(145900),7,2 OR ss_wholesale_cost_null_count@11 != row_count@2 AND ss_wholesale_cost_max@10 >= Some(5700),5,2 AND ss_wholesale_cost_null_count@11 != row_count@2 AND ss_wholesale_cost_min@12 <= Some(7700),5,2), required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_quantity, ss_wholesale_cost, ss_list_price, ss_coupon_amt], file_type=parquet, predicate=ss_quantity@10 >= 0 AND ss_quantity@10 <= 5 AND (ss_list_price@12 >= Some(800),5,2 AND ss_list_price@12 <= Some(1800),5,2 OR ss_coupon_amt@19 >= Some(45900),7,2 AND ss_coupon_amt@19 <= Some(145900),7,2 OR ss_wholesale_cost@11 >= Some(5700),5,2 AND ss_wholesale_cost@11 <= Some(7700),5,2), pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 0 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 5 AND (ss_list_price_null_count@5 != row_count@2 AND ss_list_price_max@4 >= Some(800),5,2 AND ss_list_price_null_count@5 != row_count@2 AND ss_list_price_min@6 <= Some(1800),5,2 OR ss_coupon_amt_null_count@8 != row_count@2 AND ss_coupon_amt_max@7 >= Some(45900),7,2 AND ss_coupon_amt_null_count@8 != row_count@2 AND ss_coupon_amt_min@9 <= Some(145900),7,2 OR ss_wholesale_cost_null_count@11 != row_count@2 AND ss_wholesale_cost_max@10 >= Some(5700),5,2 AND ss_wholesale_cost_null_count@11 != row_count@2 AND ss_wholesale_cost_min@12 <= Some(7700),5,2), required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 6 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 6 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[avg(store_sales.ss_list_price), count(store_sales.ss_list_price), count(DISTINCT store_sales.ss_list_price)] │ FilterExec: ss_quantity@0 >= 6 AND ss_quantity@0 <= 10 AND (ss_list_price@2 >= Some(9000),5,2 AND ss_list_price@2 <= Some(10000),5,2 OR ss_coupon_amt@3 >= Some(232300),7,2 AND ss_coupon_amt@3 <= Some(332300),7,2 OR ss_wholesale_cost@1 >= Some(3100),5,2 AND ss_wholesale_cost@1 <= Some(5100),5,2), projection=[ss_list_price@2] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_quantity, ss_wholesale_cost, ss_list_price, ss_coupon_amt], file_type=parquet, predicate=ss_quantity@10 >= 6 AND ss_quantity@10 <= 10 AND (ss_list_price@12 >= Some(9000),5,2 AND ss_list_price@12 <= Some(10000),5,2 OR ss_coupon_amt@19 >= Some(232300),7,2 AND ss_coupon_amt@19 <= Some(332300),7,2 OR ss_wholesale_cost@11 >= Some(3100),5,2 AND ss_wholesale_cost@11 <= Some(5100),5,2), pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 6 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 10 AND (ss_list_price_null_count@5 != row_count@2 AND ss_list_price_max@4 >= Some(9000),5,2 AND ss_list_price_null_count@5 != row_count@2 AND ss_list_price_min@6 <= Some(10000),5,2 OR ss_coupon_amt_null_count@8 != row_count@2 AND ss_coupon_amt_max@7 >= Some(232300),7,2 AND ss_coupon_amt_null_count@8 != row_count@2 AND ss_coupon_amt_min@9 <= Some(332300),7,2 OR ss_wholesale_cost_null_count@11 != row_count@2 AND ss_wholesale_cost_max@10 >= Some(3100),5,2 AND ss_wholesale_cost_null_count@11 != row_count@2 AND ss_wholesale_cost_min@12 <= Some(5100),5,2), required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_quantity, ss_wholesale_cost, ss_list_price, ss_coupon_amt], file_type=parquet, predicate=ss_quantity@10 >= 6 AND ss_quantity@10 <= 10 AND (ss_list_price@12 >= Some(9000),5,2 AND ss_list_price@12 <= Some(10000),5,2 OR ss_coupon_amt@19 >= Some(232300),7,2 AND ss_coupon_amt@19 <= Some(332300),7,2 OR ss_wholesale_cost@11 >= Some(3100),5,2 AND ss_wholesale_cost@11 <= Some(5100),5,2), pruning_predicate=ss_quantity_null_count@1 != row_count@2 AND ss_quantity_max@0 >= 6 AND ss_quantity_null_count@1 != row_count@2 AND ss_quantity_min@3 <= 10 AND (ss_list_price_null_count@5 != row_count@2 AND ss_list_price_max@4 >= Some(9000),5,2 AND ss_list_price_null_count@5 != row_count@2 AND ss_list_price_min@6 <= Some(10000),5,2 OR ss_coupon_amt_null_count@8 != row_count@2 AND ss_coupon_amt_max@7 >= Some(232300),7,2 AND ss_coupon_amt_null_count@8 != row_count@2 AND ss_coupon_amt_min@9 <= Some(332300),7,2 OR ss_wholesale_cost_null_count@11 != row_count@2 AND ss_wholesale_cost_max@10 >= Some(3100),5,2 AND ss_wholesale_cost_null_count@11 != row_count@2 AND ss_wholesale_cost_min@12 <= Some(5100),5,2), required_guarantees=[] └────────────────────────────────────────────────── "); Ok(()) @@ -3342,7 +3075,7 @@ mod tests { #[tokio::test] async fn test_tpcds_29() -> Result<()> { let display = test_tpcds_query("q29").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [i_item_id@0 ASC NULLS LAST, i_item_desc@1 ASC NULLS LAST, s_store_id@2 ASC NULLS LAST, s_store_name@3 ASC NULLS LAST], fetch=100 │ [Stage 11] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -3360,8 +3093,7 @@ mod tests { │ CoalescePartitionsExec │ [Stage 9] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id, i_item_desc], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id, i_item_desc], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -3375,13 +3107,11 @@ mod tests { │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(d3.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 1999 OR d_year@1 = 2000 OR d_year@1 = 2001, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=(d_year@6 = 1999 OR d_year@6 = 2000 OR d_year@6 = 2001) AND DynamicFilter [ empty ], pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1999 AND 1999 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (1999, 2000, 2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=(d_year@6 = 1999 OR d_year@6 = 2000 OR d_year@6 = 2001) AND DynamicFilter [ empty ], pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1999 AND 1999 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (1999, 2000, 2001)] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_id, s_store_name, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_id, s_store_name, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -3401,21 +3131,18 @@ mod tests { │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(d2.d_date_sk AS Float64)] │ FilterExec: d_moy@2 >= 9 AND d_moy@2 <= 12 AND d_year@1 = 1999, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_moy@8 >= 9 AND d_moy@8 <= 12 AND d_year@6 = 1999, pruning_predicate=d_moy_null_count@1 != row_count@2 AND d_moy_max@0 >= 9 AND d_moy_null_count@1 != row_count@2 AND d_moy_min@3 <= 12 AND d_year_null_count@6 != row_count@2 AND d_year_min@4 <= 1999 AND 1999 <= d_year_max@5, required_guarantees=[d_year in (1999)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_moy@8 >= 9 AND d_moy@8 <= 12 AND d_year@6 = 1999, pruning_predicate=d_moy_null_count@1 != row_count@2 AND d_moy_max@0 >= 9 AND d_moy_null_count@1 != row_count@2 AND d_moy_min@3 <= 12 AND d_year_null_count@6 != row_count@2 AND d_year_min@4 <= 1999 AND 1999 <= d_year_max@5, required_guarantees=[d_year in (1999)] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(d1.d_date_sk AS Float64)] │ FilterExec: d_moy@2 = 9 AND d_year@1 = 1999, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_moy@8 = 9 AND d_year@6 = 1999, pruning_predicate=d_moy_null_count@2 != row_count@3 AND d_moy_min@0 <= 9 AND 9 <= d_moy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 1999 AND 1999 <= d_year_max@5, required_guarantees=[d_moy in (9), d_year in (1999)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_moy@8 = 9 AND d_year@6 = 1999, pruning_predicate=d_moy_null_count@2 != row_count@3 AND d_moy_min@0 <= 9 AND 9 <= d_moy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 1999 AND 1999 <= d_year_max@5, required_guarantees=[d_moy in (9), d_year in (1999)] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([cs_bill_customer_sk@1, cs_item_sk@2], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_bill_customer_sk, cs_item_sk, cs_quantity], file_type=parquet + │ RepartitionExec: partitioning=Hash([cs_bill_customer_sk@1, cs_item_sk@2], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_bill_customer_sk, cs_item_sk, cs_quantity], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] │ RepartitionExec: partitioning=Hash([sr_customer_sk@6, sr_item_sk@5], 9), input_partitions=3 @@ -3425,14 +3152,12 @@ mod tests { │ [Stage 6] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([sr_customer_sk@2, sr_item_sk@1, sr_ticket_number@3], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], ...]}, projection=[sr_returned_date_sk, sr_item_sk, sr_customer_sk, sr_ticket_number, sr_return_quantity], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([sr_customer_sk@2, sr_item_sk@1, sr_ticket_number@3], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-3.parquet:..]]}, projection=[sr_returned_date_sk, sr_item_sk, sr_customer_sk, sr_ticket_number, sr_return_quantity], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([ss_customer_sk@2, ss_item_sk@1, ss_ticket_number@4], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_customer_sk, ss_store_sk, ss_ticket_number, ss_quantity], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([ss_customer_sk@2, ss_item_sk@1, ss_ticket_number@4], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_customer_sk, ss_store_sk, ss_ticket_number, ss_quantity], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── "); Ok(()) @@ -3546,7 +3271,7 @@ mod tests { #[tokio::test] async fn test_tpcds_31() -> Result<()> { let display = test_tpcds_query("q31").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [ca_county@0 ASC NULLS LAST] │ [Stage 24] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -3595,28 +3320,25 @@ mod tests { │ AggregateExec: mode=Partial, gby=[ca_county@3 as ca_county, d_qoy@2 as d_qoy, d_year@1 as d_year], aggr=[sum(web_sales.ws_ext_sales_price)], ordering_mode=PartiallySorted([1, 2]) │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ws_bill_addr_sk@0, CAST(customer_address.ca_address_sk AS Float64)@2)], projection=[ws_ext_sales_price@1, d_year@2, d_qoy@3, ca_county@5] │ CoalescePartitionsExec - │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[ca_address_sk@0 as ca_address_sk, ca_county@1 as ca_county, CAST(ca_address_sk@0 AS Float64) as CAST(customer_address.ca_address_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_county], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_county], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[ws_bill_addr_sk@2 as ws_bill_addr_sk, ws_ext_sales_price@3 as ws_ext_sales_price, d_year@0 as d_year, d_qoy@1 as d_qoy] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@3, ws_sold_date_sk@0)], projection=[d_year@1, d_qoy@2, ws_bill_addr_sk@5, ws_ext_sales_price@6] │ CoalescePartitionsExec │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_bill_addr_sk, ws_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_bill_addr_sk, ws_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, d_qoy@2 as d_qoy, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_qoy@2 = 1 AND d_year@1 = 2000 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_qoy], file_type=parquet, predicate=d_qoy@10 = 1 AND d_year@6 = 2000, pruning_predicate=d_qoy_null_count@2 != row_count@3 AND d_qoy_min@0 <= 1 AND 1 <= d_qoy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 2000 AND 2000 <= d_year_max@5, required_guarantees=[d_qoy in (1), d_year in (2000)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_qoy], file_type=parquet, predicate=d_qoy@10 = 1 AND d_year@6 = 2000, pruning_predicate=d_qoy_null_count@2 != row_count@3 AND d_qoy_min@0 <= 1 AND 1 <= d_qoy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 2000 AND 2000 <= d_year_max@5, required_guarantees=[d_qoy in (1), d_year in (2000)] └────────────────────────────────────────────────── ┌───── Stage 12 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=3, consumer_tasks=2, output_partitions=6 @@ -3638,140 +3360,125 @@ mod tests { │ AggregateExec: mode=Partial, gby=[ca_county@3 as ca_county, d_qoy@2 as d_qoy, d_year@1 as d_year], aggr=[sum(store_sales.ss_ext_sales_price)], ordering_mode=PartiallySorted([1, 2]) │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ss_addr_sk@0, CAST(customer_address.ca_address_sk AS Float64)@2)], projection=[ss_ext_sales_price@1, d_year@2, d_qoy@3, ca_county@5] │ CoalescePartitionsExec - │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[ca_address_sk@0 as ca_address_sk, ca_county@1 as ca_county, CAST(ca_address_sk@0 AS Float64) as CAST(customer_address.ca_address_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_county], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_county], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 6 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 6 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[ss_addr_sk@2 as ss_addr_sk, ss_ext_sales_price@3 as ss_ext_sales_price, d_year@0 as d_year, d_qoy@1 as d_qoy] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@3, ss_sold_date_sk@0)], projection=[d_year@1, d_qoy@2, ss_addr_sk@5, ss_ext_sales_price@6] │ CoalescePartitionsExec │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_addr_sk, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_addr_sk, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, d_qoy@2 as d_qoy, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_qoy@2 = 1 AND d_year@1 = 2000 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_qoy], file_type=parquet, predicate=d_qoy@10 = 1 AND d_year@6 = 2000, pruning_predicate=d_qoy_null_count@2 != row_count@3 AND d_qoy_min@0 <= 1 AND 1 <= d_qoy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 2000 AND 2000 <= d_year_max@5, required_guarantees=[d_qoy in (1), d_year in (2000)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_qoy], file_type=parquet, predicate=d_qoy@10 = 1 AND d_year@6 = 2000, pruning_predicate=d_qoy_null_count@2 != row_count@3 AND d_qoy_min@0 <= 1 AND 1 <= d_qoy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 2000 AND 2000 <= d_year_max@5, required_guarantees=[d_qoy in (1), d_year in (2000)] └────────────────────────────────────────────────── ┌───── Stage 11 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] │ RepartitionExec: partitioning=Hash([ca_county@0, d_qoy@1, d_year@2], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[ca_county@3 as ca_county, d_qoy@2 as d_qoy, d_year@1 as d_year], aggr=[sum(store_sales.ss_ext_sales_price)], ordering_mode=PartiallySorted([1, 2]) │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ss_addr_sk@0, CAST(customer_address.ca_address_sk AS Float64)@2)], projection=[ss_ext_sales_price@1, d_year@2, d_qoy@3, ca_county@5] │ CoalescePartitionsExec - │ [Stage 10] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 10] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[ca_address_sk@0 as ca_address_sk, ca_county@1 as ca_county, CAST(ca_address_sk@0 AS Float64) as CAST(customer_address.ca_address_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_county], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_county], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 10 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 10 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[ss_addr_sk@2 as ss_addr_sk, ss_ext_sales_price@3 as ss_ext_sales_price, d_year@0 as d_year, d_qoy@1 as d_qoy] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@3, ss_sold_date_sk@0)], projection=[d_year@1, d_qoy@2, ss_addr_sk@5, ss_ext_sales_price@6] │ CoalescePartitionsExec │ [Stage 9] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_addr_sk, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_addr_sk, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, d_qoy@2 as d_qoy, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_qoy@2 = 2 AND d_year@1 = 2000 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_qoy], file_type=parquet, predicate=d_qoy@10 = 2 AND d_year@6 = 2000, pruning_predicate=d_qoy_null_count@2 != row_count@3 AND d_qoy_min@0 <= 2 AND 2 <= d_qoy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 2000 AND 2000 <= d_year_max@5, required_guarantees=[d_qoy in (2), d_year in (2000)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_qoy], file_type=parquet, predicate=d_qoy@10 = 2 AND d_year@6 = 2000, pruning_predicate=d_qoy_null_count@2 != row_count@3 AND d_qoy_min@0 <= 2 AND 2 <= d_qoy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 2000 AND 2000 <= d_year_max@5, required_guarantees=[d_qoy in (2), d_year in (2000)] └────────────────────────────────────────────────── ┌───── Stage 15 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] │ RepartitionExec: partitioning=Hash([ca_county@0, d_qoy@1, d_year@2], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[ca_county@3 as ca_county, d_qoy@2 as d_qoy, d_year@1 as d_year], aggr=[sum(store_sales.ss_ext_sales_price)], ordering_mode=PartiallySorted([1, 2]) │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ss_addr_sk@0, CAST(customer_address.ca_address_sk AS Float64)@2)], projection=[ss_ext_sales_price@1, d_year@2, d_qoy@3, ca_county@5] │ CoalescePartitionsExec - │ [Stage 14] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 14] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[ca_address_sk@0 as ca_address_sk, ca_county@1 as ca_county, CAST(ca_address_sk@0 AS Float64) as CAST(customer_address.ca_address_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_county], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_county], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 14 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 14 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[ss_addr_sk@2 as ss_addr_sk, ss_ext_sales_price@3 as ss_ext_sales_price, d_year@0 as d_year, d_qoy@1 as d_qoy] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@3, ss_sold_date_sk@0)], projection=[d_year@1, d_qoy@2, ss_addr_sk@5, ss_ext_sales_price@6] │ CoalescePartitionsExec │ [Stage 13] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_addr_sk, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_addr_sk, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 13 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, d_qoy@2 as d_qoy, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_qoy@2 = 3 AND d_year@1 = 2000 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_qoy], file_type=parquet, predicate=d_qoy@10 = 3 AND d_year@6 = 2000, pruning_predicate=d_qoy_null_count@2 != row_count@3 AND d_qoy_min@0 <= 3 AND 3 <= d_qoy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 2000 AND 2000 <= d_year_max@5, required_guarantees=[d_qoy in (3), d_year in (2000)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_qoy], file_type=parquet, predicate=d_qoy@10 = 3 AND d_year@6 = 2000, pruning_predicate=d_qoy_null_count@2 != row_count@3 AND d_qoy_min@0 <= 3 AND 3 <= d_qoy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 2000 AND 2000 <= d_year_max@5, required_guarantees=[d_qoy in (3), d_year in (2000)] └────────────────────────────────────────────────── ┌───── Stage 19 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] │ RepartitionExec: partitioning=Hash([ca_county@0, d_qoy@1, d_year@2], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[ca_county@3 as ca_county, d_qoy@2 as d_qoy, d_year@1 as d_year], aggr=[sum(web_sales.ws_ext_sales_price)], ordering_mode=PartiallySorted([1, 2]) │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ws_bill_addr_sk@0, CAST(customer_address.ca_address_sk AS Float64)@2)], projection=[ws_ext_sales_price@1, d_year@2, d_qoy@3, ca_county@5] │ CoalescePartitionsExec - │ [Stage 18] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 18] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[ca_address_sk@0 as ca_address_sk, ca_county@1 as ca_county, CAST(ca_address_sk@0 AS Float64) as CAST(customer_address.ca_address_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_county], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_county], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 18 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 18 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[ws_bill_addr_sk@2 as ws_bill_addr_sk, ws_ext_sales_price@3 as ws_ext_sales_price, d_year@0 as d_year, d_qoy@1 as d_qoy] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@3, ws_sold_date_sk@0)], projection=[d_year@1, d_qoy@2, ws_bill_addr_sk@5, ws_ext_sales_price@6] │ CoalescePartitionsExec │ [Stage 17] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_bill_addr_sk, ws_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_bill_addr_sk, ws_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 17 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, d_qoy@2 as d_qoy, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_qoy@2 = 2 AND d_year@1 = 2000 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_qoy], file_type=parquet, predicate=d_qoy@10 = 2 AND d_year@6 = 2000, pruning_predicate=d_qoy_null_count@2 != row_count@3 AND d_qoy_min@0 <= 2 AND 2 <= d_qoy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 2000 AND 2000 <= d_year_max@5, required_guarantees=[d_qoy in (2), d_year in (2000)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_qoy], file_type=parquet, predicate=d_qoy@10 = 2 AND d_year@6 = 2000, pruning_predicate=d_qoy_null_count@2 != row_count@3 AND d_qoy_min@0 <= 2 AND 2 <= d_qoy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 2000 AND 2000 <= d_year_max@5, required_guarantees=[d_qoy in (2), d_year in (2000)] └────────────────────────────────────────────────── ┌───── Stage 23 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] │ RepartitionExec: partitioning=Hash([ca_county@0, d_qoy@1, d_year@2], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[ca_county@3 as ca_county, d_qoy@2 as d_qoy, d_year@1 as d_year], aggr=[sum(web_sales.ws_ext_sales_price)], ordering_mode=PartiallySorted([1, 2]) │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ws_bill_addr_sk@0, CAST(customer_address.ca_address_sk AS Float64)@2)], projection=[ws_ext_sales_price@1, d_year@2, d_qoy@3, ca_county@5] │ CoalescePartitionsExec - │ [Stage 22] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 22] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[ca_address_sk@0 as ca_address_sk, ca_county@1 as ca_county, CAST(ca_address_sk@0 AS Float64) as CAST(customer_address.ca_address_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_county], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_county], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 22 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 22 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[ws_bill_addr_sk@2 as ws_bill_addr_sk, ws_ext_sales_price@3 as ws_ext_sales_price, d_year@0 as d_year, d_qoy@1 as d_qoy] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@3, ws_sold_date_sk@0)], projection=[d_year@1, d_qoy@2, ws_bill_addr_sk@5, ws_ext_sales_price@6] │ CoalescePartitionsExec │ [Stage 21] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_bill_addr_sk, ws_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_bill_addr_sk, ws_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 21 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, d_qoy@2 as d_qoy, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_qoy@2 = 3 AND d_year@1 = 2000 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_qoy], file_type=parquet, predicate=d_qoy@10 = 3 AND d_year@6 = 2000, pruning_predicate=d_qoy_null_count@2 != row_count@3 AND d_qoy_min@0 <= 3 AND 3 <= d_qoy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 2000 AND 2000 <= d_year_max@5, required_guarantees=[d_qoy in (3), d_year in (2000)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_qoy], file_type=parquet, predicate=d_qoy@10 = 3 AND d_year@6 = 2000, pruning_predicate=d_qoy_null_count@2 != row_count@3 AND d_qoy_min@0 <= 3 AND 3 <= d_qoy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 2000 AND 2000 <= d_year_max@5, required_guarantees=[d_qoy in (3), d_year in (2000)] └────────────────────────────────────────────────── "); Ok(()) @@ -3779,7 +3486,7 @@ mod tests { #[tokio::test] async fn test_tpcds_32() -> Result<()> { let display = test_tpcds_query("q32").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ ProjectionExec: expr=[sum(catalog_sales.cs_ext_discount_amt)@0 as excess discount amount] │ GlobalLimitExec: skip=0, fetch=100 @@ -3791,13 +3498,13 @@ mod tests { │ AggregateExec: mode=Partial, gby=[], aggr=[sum(catalog_sales.cs_ext_discount_amt)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@1, cs_item_sk@1)], filter=CAST(cs_ext_discount_amt@0 AS Decimal128(30, 15)) > Float64(1.3) * avg(catalog_sales.cs_ext_discount_amt)@1, projection=[cs_ext_discount_amt@0] │ CoalescePartitionsExec - │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[CAST(1.3 * CAST(avg(catalog_sales.cs_ext_discount_amt)@1 AS Float64) AS Decimal128(30, 15)) as Float64(1.3) * avg(catalog_sales.cs_ext_discount_amt), cs_item_sk@0 as cs_item_sk] │ AggregateExec: mode=FinalPartitioned, gby=[cs_item_sk@0 as cs_item_sk], aggr=[avg(catalog_sales.cs_ext_discount_amt)] │ [Stage 5] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── - ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, cs_sold_date_sk@0)], projection=[cs_ext_discount_amt@3, i_item_sk@4] │ CoalescePartitionsExec │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 @@ -3805,40 +3512,35 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, cs_item_sk@1)], projection=[i_item_sk@0, cs_sold_date_sk@1, cs_ext_discount_amt@3] │ CoalescePartitionsExec │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_item_sk, cs_ext_discount_amt], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_item_sk, cs_ext_discount_amt], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_date@1 >= 2000-01-27 AND d_date@1 <= 2000-04-26, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-01-27 AND d_date@2 <= 2000-04-26, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-01-27 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-04-26, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-01-27 AND d_date@2 <= 2000-04-26, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-01-27 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-04-26, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ FilterExec: i_manufact_id@1 = 977, projection=[i_item_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_manufact_id], file_type=parquet, predicate=i_manufact_id@13 = 977, pruning_predicate=i_manufact_id_null_count@2 != row_count@3 AND i_manufact_id_min@0 <= 977 AND 977 <= i_manufact_id_max@1, required_guarantees=[i_manufact_id in (977)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_manufact_id], file_type=parquet, predicate=i_manufact_id@13 = 977, pruning_predicate=i_manufact_id_null_count@2 != row_count@3 AND i_manufact_id_min@0 <= 977 AND 977 <= i_manufact_id_max@1, required_guarantees=[i_manufact_id in (977)] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([cs_item_sk@0], 9), input_partitions=2 + │ RepartitionExec: partitioning=Hash([cs_item_sk@0], 9), input_partitions=3 │ AggregateExec: mode=Partial, gby=[cs_item_sk@0 as cs_item_sk], aggr=[avg(catalog_sales.cs_ext_discount_amt)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, cs_sold_date_sk@0)], projection=[cs_item_sk@3, cs_ext_discount_amt@4] │ CoalescePartitionsExec │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_item_sk, cs_ext_discount_amt], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_item_sk, cs_ext_discount_amt], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_date@1 >= 2000-01-27 AND d_date@1 <= 2000-04-26, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-01-27 AND d_date@2 <= 2000-04-26, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-01-27 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-04-26, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-01-27 AND d_date@2 <= 2000-04-26, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-01-27 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-04-26, required_guarantees=[] └────────────────────────────────────────────────── "); Ok(()) @@ -3846,7 +3548,7 @@ mod tests { #[tokio::test] async fn test_tpcds_33() -> Result<()> { let display = test_tpcds_query("q33").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [total_sales@1 ASC NULLS LAST], fetch=100 │ SortExec: TopK(fetch=100), expr=[total_sales@1 ASC NULLS LAST], preserve_partitioning=[true] @@ -3889,105 +3591,93 @@ mod tests { │ CoalescePartitionsExec │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_manufact_id], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_manufact_id], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ss_addr_sk@1, CAST(customer_address.ca_address_sk AS Float64)@1)], projection=[ss_item_sk@0, ss_ext_sales_price@2] │ CoalescePartitionsExec - │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[ca_address_sk@0 as ca_address_sk, CAST(ca_address_sk@0 AS Float64) as CAST(customer_address.ca_address_sk AS Float64)] │ FilterExec: ca_gmt_offset@1 = Some(-500),4,2, projection=[ca_address_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_gmt_offset], file_type=parquet, predicate=ca_gmt_offset@11 = Some(-500),4,2 AND DynamicFilter [ empty ], pruning_predicate=ca_gmt_offset_null_count@2 != row_count@3 AND ca_gmt_offset_min@0 <= Some(-500),4,2 AND Some(-500),4,2 <= ca_gmt_offset_max@1, required_guarantees=[ca_gmt_offset in (Some(-500),4,2)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_gmt_offset], file_type=parquet, predicate=ca_gmt_offset@11 = Some(-500),4,2 AND DynamicFilter [ empty ], pruning_predicate=ca_gmt_offset_null_count@2 != row_count@3 AND ca_gmt_offset_min@0 <= Some(-500),4,2 AND Some(-500),4,2 <= ca_gmt_offset_max@1, required_guarantees=[ca_gmt_offset in (Some(-500),4,2)] └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ss_sold_date_sk@0)], projection=[ss_item_sk@3, ss_addr_sk@4, ss_ext_sales_price@5] │ CoalescePartitionsExec │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_addr_sk, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_addr_sk, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 1998 AND d_moy@2 = 5, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1998 AND d_moy@8 = 5, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 5 AND 5 <= d_moy_max@5, required_guarantees=[d_moy in (5), d_year in (1998)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1998 AND d_moy@8 = 5, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 5 AND 5 <= d_moy_max@5, required_guarantees=[d_moy in (5), d_year in (1998)] └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(cs_item_sk@0, i_item_sk@0)], projection=[cs_ext_sales_price@1, i_manufact_id@3] │ CoalescePartitionsExec │ [Stage 7] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_manufact_id], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_manufact_id], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(cs_bill_addr_sk@0, CAST(customer_address.ca_address_sk AS Float64)@1)], projection=[cs_item_sk@1, cs_ext_sales_price@2] │ CoalescePartitionsExec - │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[ca_address_sk@0 as ca_address_sk, CAST(ca_address_sk@0 AS Float64) as CAST(customer_address.ca_address_sk AS Float64)] │ FilterExec: ca_gmt_offset@1 = Some(-500),4,2, projection=[ca_address_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_gmt_offset], file_type=parquet, predicate=ca_gmt_offset@11 = Some(-500),4,2 AND DynamicFilter [ empty ], pruning_predicate=ca_gmt_offset_null_count@2 != row_count@3 AND ca_gmt_offset_min@0 <= Some(-500),4,2 AND Some(-500),4,2 <= ca_gmt_offset_max@1, required_guarantees=[ca_gmt_offset in (Some(-500),4,2)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_gmt_offset], file_type=parquet, predicate=ca_gmt_offset@11 = Some(-500),4,2 AND DynamicFilter [ empty ], pruning_predicate=ca_gmt_offset_null_count@2 != row_count@3 AND ca_gmt_offset_min@0 <= Some(-500),4,2 AND Some(-500),4,2 <= ca_gmt_offset_max@1, required_guarantees=[ca_gmt_offset in (Some(-500),4,2)] └────────────────────────────────────────────────── - ┌───── Stage 6 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 6 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, cs_sold_date_sk@0)], projection=[cs_bill_addr_sk@3, cs_item_sk@4, cs_ext_sales_price@5] │ CoalescePartitionsExec │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_bill_addr_sk, cs_item_sk, cs_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_bill_addr_sk, cs_item_sk, cs_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 1998 AND d_moy@2 = 5, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1998 AND d_moy@8 = 5, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 5 AND 5 <= d_moy_max@5, required_guarantees=[d_moy in (5), d_year in (1998)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1998 AND d_moy@8 = 5, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 5 AND 5 <= d_moy_max@5, required_guarantees=[d_moy in (5), d_year in (1998)] └────────────────────────────────────────────────── ┌───── Stage 12 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ws_item_sk@0, i_item_sk@0)], projection=[ws_ext_sales_price@1, i_manufact_id@3] │ CoalescePartitionsExec │ [Stage 11] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_manufact_id], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_manufact_id], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 11 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ws_bill_addr_sk@1, CAST(customer_address.ca_address_sk AS Float64)@1)], projection=[ws_item_sk@0, ws_ext_sales_price@2] │ CoalescePartitionsExec - │ [Stage 10] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 10] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[ca_address_sk@0 as ca_address_sk, CAST(ca_address_sk@0 AS Float64) as CAST(customer_address.ca_address_sk AS Float64)] │ FilterExec: ca_gmt_offset@1 = Some(-500),4,2, projection=[ca_address_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_gmt_offset], file_type=parquet, predicate=ca_gmt_offset@11 = Some(-500),4,2 AND DynamicFilter [ empty ], pruning_predicate=ca_gmt_offset_null_count@2 != row_count@3 AND ca_gmt_offset_min@0 <= Some(-500),4,2 AND Some(-500),4,2 <= ca_gmt_offset_max@1, required_guarantees=[ca_gmt_offset in (Some(-500),4,2)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_gmt_offset], file_type=parquet, predicate=ca_gmt_offset@11 = Some(-500),4,2 AND DynamicFilter [ empty ], pruning_predicate=ca_gmt_offset_null_count@2 != row_count@3 AND ca_gmt_offset_min@0 <= Some(-500),4,2 AND Some(-500),4,2 <= ca_gmt_offset_max@1, required_guarantees=[ca_gmt_offset in (Some(-500),4,2)] └────────────────────────────────────────────────── - ┌───── Stage 10 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 10 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ws_sold_date_sk@0)], projection=[ws_item_sk@3, ws_bill_addr_sk@4, ws_ext_sales_price@5] │ CoalescePartitionsExec │ [Stage 9] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_item_sk, ws_bill_addr_sk, ws_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_item_sk, ws_bill_addr_sk, ws_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 1998 AND d_moy@2 = 5, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1998 AND d_moy@8 = 5, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 5 AND 5 <= d_moy_max@5, required_guarantees=[d_moy in (5), d_year in (1998)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1998 AND d_moy@8 = 5, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 5 AND 5 <= d_moy_max@5, required_guarantees=[d_moy in (5), d_year in (1998)] └────────────────────────────────────────────────── "); Ok(()) @@ -3995,7 +3685,7 @@ mod tests { #[tokio::test] async fn test_tpcds_34() -> Result<()> { let display = test_tpcds_query("q34").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [c_last_name@0 ASC, c_first_name@1 ASC, c_salutation@2 ASC, c_preferred_cust_flag@3 DESC, ss_ticket_number@4 ASC] │ [Stage 6] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -4008,8 +3698,7 @@ mod tests { │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=6, input_tasks=2 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_salutation@1 as c_salutation, c_first_name@2 as c_first_name, c_last_name@3 as c_last_name, c_preferred_cust_flag@4 as c_preferred_cust_flag, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_salutation, c_first_name, c_last_name, c_preferred_cust_flag], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_salutation, c_first_name, c_last_name, c_preferred_cust_flag], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=3, consumer_tasks=2, output_partitions=6 @@ -4019,7 +3708,7 @@ mod tests { │ [Stage 4] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([ss_ticket_number@0, ss_customer_sk@1], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([ss_ticket_number@0, ss_customer_sk@1], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[ss_ticket_number@1 as ss_ticket_number, ss_customer_sk@0 as ss_customer_sk], aggr=[count(Int64(1))] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(household_demographics.hd_demo_sk AS Float64)@1, ss_hdemo_sk@1)], projection=[ss_customer_sk@2, ss_ticket_number@4] │ CoalescePartitionsExec @@ -4030,32 +3719,28 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ss_sold_date_sk@0)], projection=[ss_customer_sk@3, ss_hdemo_sk@4, ss_store_sk@5, ss_ticket_number@6] │ CoalescePartitionsExec │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_customer_sk, ss_hdemo_sk, ss_store_sk, ss_ticket_number], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_customer_sk, ss_hdemo_sk, ss_store_sk, ss_ticket_number], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[hd_demo_sk@0 as hd_demo_sk, CAST(hd_demo_sk@0 AS Float64) as CAST(household_demographics.hd_demo_sk AS Float64)] │ FilterExec: (hd_buy_potential@1 = >10000 OR hd_buy_potential@1 = Unknown) AND hd_vehicle_count@3 > 0 AND CASE WHEN hd_vehicle_count@3 > 0 THEN CAST(hd_dep_count@2 AS Float64) / CAST(hd_vehicle_count@3 AS Float64) END > 1.2, projection=[hd_demo_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_buy_potential, hd_dep_count, hd_vehicle_count], file_type=parquet, predicate=(hd_buy_potential@2 = >10000 OR hd_buy_potential@2 = Unknown) AND hd_vehicle_count@4 > 0 AND CASE WHEN hd_vehicle_count@4 > 0 THEN CAST(hd_dep_count@3 AS Float64) / CAST(hd_vehicle_count@4 AS Float64) END > 1.2, pruning_predicate=(hd_buy_potential_null_count@2 != row_count@3 AND hd_buy_potential_min@0 <= >10000 AND >10000 <= hd_buy_potential_max@1 OR hd_buy_potential_null_count@2 != row_count@3 AND hd_buy_potential_min@0 <= Unknown AND Unknown <= hd_buy_potential_max@1) AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_max@4 > 0, required_guarantees=[hd_buy_potential in (>10000, Unknown)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_buy_potential, hd_dep_count, hd_vehicle_count], file_type=parquet, predicate=(hd_buy_potential@2 = >10000 OR hd_buy_potential@2 = Unknown) AND hd_vehicle_count@4 > 0 AND CASE WHEN hd_vehicle_count@4 > 0 THEN CAST(hd_dep_count@3 AS Float64) / CAST(hd_vehicle_count@4 AS Float64) END > 1.2, pruning_predicate=(hd_buy_potential_null_count@2 != row_count@3 AND hd_buy_potential_min@0 <= >10000 AND >10000 <= hd_buy_potential_max@1 OR hd_buy_potential_null_count@2 != row_count@3 AND hd_buy_potential_min@0 <= Unknown AND Unknown <= hd_buy_potential_max@1) AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_max@4 > 0, required_guarantees=[hd_buy_potential in (>10000, Unknown)] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[s_store_sk@0 as s_store_sk, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)] │ FilterExec: s_county@1 = Williamson County, projection=[s_store_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_county], file_type=parquet, predicate=s_county@23 = Williamson County, pruning_predicate=s_county_null_count@2 != row_count@3 AND s_county_min@0 <= Williamson County AND Williamson County <= s_county_max@1, required_guarantees=[s_county in (Williamson County)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_county], file_type=parquet, predicate=s_county@23 = Williamson County, pruning_predicate=s_county_null_count@2 != row_count@3 AND s_county_min@0 <= Williamson County AND Williamson County <= s_county_max@1, required_guarantees=[s_county in (Williamson County)] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: (d_dom@2 >= 1 AND d_dom@2 <= 3 OR d_dom@2 >= 25 AND d_dom@2 <= 28) AND (d_year@1 = 1999 OR d_year@1 = 2000 OR d_year@1 = 2001), projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_dom], file_type=parquet, predicate=(d_dom@9 >= 1 AND d_dom@9 <= 3 OR d_dom@9 >= 25 AND d_dom@9 <= 28) AND (d_year@6 = 1999 OR d_year@6 = 2000 OR d_year@6 = 2001), pruning_predicate=(d_dom_null_count@1 != row_count@2 AND d_dom_max@0 >= 1 AND d_dom_null_count@1 != row_count@2 AND d_dom_min@3 <= 3 OR d_dom_null_count@1 != row_count@2 AND d_dom_max@0 >= 25 AND d_dom_null_count@1 != row_count@2 AND d_dom_min@3 <= 28) AND (d_year_null_count@6 != row_count@2 AND d_year_min@4 <= 1999 AND 1999 <= d_year_max@5 OR d_year_null_count@6 != row_count@2 AND d_year_min@4 <= 2000 AND 2000 <= d_year_max@5 OR d_year_null_count@6 != row_count@2 AND d_year_min@4 <= 2001 AND 2001 <= d_year_max@5), required_guarantees=[d_year in (1999, 2000, 2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_dom], file_type=parquet, predicate=(d_dom@9 >= 1 AND d_dom@9 <= 3 OR d_dom@9 >= 25 AND d_dom@9 <= 28) AND (d_year@6 = 1999 OR d_year@6 = 2000 OR d_year@6 = 2001), pruning_predicate=(d_dom_null_count@1 != row_count@2 AND d_dom_max@0 >= 1 AND d_dom_null_count@1 != row_count@2 AND d_dom_min@3 <= 3 OR d_dom_null_count@1 != row_count@2 AND d_dom_max@0 >= 25 AND d_dom_null_count@1 != row_count@2 AND d_dom_min@3 <= 28) AND (d_year_null_count@6 != row_count@2 AND d_year_min@4 <= 1999 AND 1999 <= d_year_max@5 OR d_year_null_count@6 != row_count@2 AND d_year_min@4 <= 2000 AND 2000 <= d_year_max@5 OR d_year_null_count@6 != row_count@2 AND d_year_min@4 <= 2001 AND 2001 <= d_year_max@5), required_guarantees=[d_year in (1999, 2000, 2001)] └────────────────────────────────────────────────── "); Ok(()) @@ -4063,7 +3748,7 @@ mod tests { #[tokio::test] async fn test_tpcds_35() -> Result<()> { let display = test_tpcds_query("q35").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [ca_state@0 ASC, cd_gender@1 ASC, cd_marital_status@2 ASC, cd_dep_count@3 ASC, cd_dep_employed_count@8 ASC, cd_dep_college_count@13 ASC], fetch=100 │ SortExec: TopK(fetch=100), expr=[ca_state@0 ASC, cd_gender@1 ASC, cd_marital_status@2 ASC, cd_dep_count@3 ASC, cd_dep_employed_count@8 ASC, cd_dep_college_count@13 ASC], preserve_partitioning=[true] @@ -4072,74 +3757,67 @@ mod tests { │ [Stage 9] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p2] t1:[p0..p2] t2:[p0..p2] - │ RepartitionExec: partitioning=Hash([ca_state@0, cd_gender@1, cd_marital_status@2, cd_dep_count@3, cd_dep_employed_count@4, cd_dep_college_count@5], 3), input_partitions=2 + │ RepartitionExec: partitioning=Hash([ca_state@0, cd_gender@1, cd_marital_status@2, cd_dep_count@3, cd_dep_employed_count@4, cd_dep_college_count@5], 3), input_partitions=3 │ AggregateExec: mode=Partial, gby=[ca_state@0 as ca_state, cd_gender@1 as cd_gender, cd_marital_status@2 as cd_marital_status, cd_dep_count@3 as cd_dep_count, cd_dep_employed_count@4 as cd_dep_employed_count, cd_dep_college_count@5 as cd_dep_college_count], aggr=[count(Int64(1)), min(customer_demographics.cd_dep_count), max(customer_demographics.cd_dep_count), avg(customer_demographics.cd_dep_count), min(customer_demographics.cd_dep_employed_count), max(customer_demographics.cd_dep_employed_count), avg(customer_demographics.cd_dep_employed_count), min(customer_demographics.cd_dep_college_count), max(customer_demographics.cd_dep_college_count), avg(customer_demographics.cd_dep_college_count)] │ FilterExec: mark@6 OR mark@7, projection=[ca_state@0, cd_gender@1, cd_marital_status@2, cd_dep_count@3, cd_dep_employed_count@4, cd_dep_college_count@5] │ HashJoinExec: mode=CollectLeft, join_type=RightMark, on=[(cs_ship_customer_sk@0, CAST(c.c_customer_sk AS Float64)@8)], projection=[ca_state@1, cd_gender@2, cd_marital_status@3, cd_dep_count@4, cd_dep_employed_count@5, cd_dep_college_count@6, mark@7, mark@9] │ CoalescePartitionsExec - │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, ca_state@1 as ca_state, cd_gender@2 as cd_gender, cd_marital_status@3 as cd_marital_status, cd_dep_count@4 as cd_dep_count, cd_dep_employed_count@5 as cd_dep_employed_count, cd_dep_college_count@6 as cd_dep_college_count, mark@7 as mark, CAST(c_customer_sk@0 AS Float64) as CAST(c.c_customer_sk AS Float64)] │ HashJoinExec: mode=CollectLeft, join_type=RightMark, on=[(ws_bill_customer_sk@0, CAST(c.c_customer_sk AS Float64)@7)], projection=[c_customer_sk@0, ca_state@1, cd_gender@2, cd_marital_status@3, cd_dep_count@4, cd_dep_employed_count@5, cd_dep_college_count@6, mark@8] │ CoalescePartitionsExec - │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, ca_state@1 as ca_state, cd_gender@2 as cd_gender, cd_marital_status@3 as cd_marital_status, cd_dep_count@4 as cd_dep_count, cd_dep_employed_count@5 as cd_dep_employed_count, cd_dep_college_count@6 as cd_dep_college_count, CAST(c_customer_sk@0 AS Float64) as CAST(c.c_customer_sk AS Float64)] │ HashJoinExec: mode=CollectLeft, join_type=RightSemi, on=[(ss_customer_sk@0, CAST(c.c_customer_sk AS Float64)@7)], projection=[c_customer_sk@0, ca_state@1, cd_gender@2, cd_marital_status@3, cd_dep_count@4, cd_dep_employed_count@5, cd_dep_college_count@6] │ CoalescePartitionsExec - │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, ca_state@1 as ca_state, cd_gender@2 as cd_gender, cd_marital_status@3 as cd_marital_status, cd_dep_count@4 as cd_dep_count, cd_dep_employed_count@5 as cd_dep_employed_count, cd_dep_college_count@6 as cd_dep_college_count, CAST(c_customer_sk@0 AS Float64) as CAST(c.c_customer_sk AS Float64)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(c_current_cdemo_sk@1, CAST(customer_demographics.cd_demo_sk AS Float64)@6)], projection=[c_customer_sk@0, ca_state@2, cd_gender@4, cd_marital_status@5, cd_dep_count@6, cd_dep_employed_count@7, cd_dep_college_count@8] │ CoalescePartitionsExec │ [Stage 8] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], ...]}, projection=[cd_demo_sk, cd_gender, cd_marital_status, cd_dep_count, cd_dep_employed_count, cd_dep_college_count, CAST(cd_demo_sk@0 AS Float64) as CAST(customer_demographics.cd_demo_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-3.parquet:..]]}, projection=[cd_demo_sk, cd_gender, cd_marital_status, cd_dep_count, cd_dep_employed_count, cd_dep_college_count, CAST(cd_demo_sk@0 AS Float64) as CAST(customer_demographics.cd_demo_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, cs_sold_date_sk@0)], projection=[cs_ship_customer_sk@3] │ CoalescePartitionsExec │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_ship_customer_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_ship_customer_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2002 AND d_qoy@2 < 4, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_qoy], file_type=parquet, predicate=d_year@6 = 2002 AND d_qoy@10 < 4, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1 AND d_qoy_null_count@5 != row_count@3 AND d_qoy_min@4 < 4, required_guarantees=[d_year in (2002)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_qoy], file_type=parquet, predicate=d_year@6 = 2002 AND d_qoy@10 < 4, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1 AND d_qoy_null_count@5 != row_count@3 AND d_qoy_min@4 < 4, required_guarantees=[d_year in (2002)] └────────────────────────────────────────────────── - ┌───── Stage 4 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 4 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ws_sold_date_sk@0)], projection=[ws_bill_customer_sk@3] │ CoalescePartitionsExec │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_bill_customer_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_bill_customer_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2002 AND d_qoy@2 < 4, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_qoy], file_type=parquet, predicate=d_year@6 = 2002 AND d_qoy@10 < 4, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1 AND d_qoy_null_count@5 != row_count@3 AND d_qoy_min@4 < 4, required_guarantees=[d_year in (2002)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_qoy], file_type=parquet, predicate=d_year@6 = 2002 AND d_qoy@10 < 4, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1 AND d_qoy_null_count@5 != row_count@3 AND d_qoy_min@4 < 4, required_guarantees=[d_year in (2002)] └────────────────────────────────────────────────── - ┌───── Stage 6 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 6 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ss_sold_date_sk@0)], projection=[ss_customer_sk@3] │ CoalescePartitionsExec │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_customer_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_customer_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2002 AND d_qoy@2 < 4, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_qoy], file_type=parquet, predicate=d_year@6 = 2002 AND d_qoy@10 < 4, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1 AND d_qoy_null_count@5 != row_count@3 AND d_qoy_min@4 < 4, required_guarantees=[d_year in (2002)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_qoy], file_type=parquet, predicate=d_year@6 = 2002 AND d_qoy@10 < 4, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1 AND d_qoy_null_count@5 != row_count@3 AND d_qoy_min@4 < 4, required_guarantees=[d_year in (2002)] └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -4148,13 +3826,11 @@ mod tests { │ CoalescePartitionsExec │ [Stage 7] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=4, input_tasks=2 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_cdemo_sk, c_current_addr_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_cdemo_sk, c_current_addr_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p3] t1:[p4..p7] │ BroadcastExec: input_partitions=2, consumer_tasks=2, output_partitions=4 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_state], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_state], file_type=parquet └────────────────────────────────────────────────── "); Ok(()) @@ -4205,34 +3881,30 @@ mod tests { │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ss_item_sk@0, i_item_sk@0)], projection=[ss_store_sk@1, ss_ext_sales_price@2, ss_net_profit@3, i_class@5, i_category@6] │ CoalescePartitionsExec - │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_class, i_category], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_class, i_category], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[s_store_sk@0 as s_store_sk, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)] │ FilterExec: s_state@1 = TN, projection=[s_store_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_state], file_type=parquet, predicate=s_state@24 = TN, pruning_predicate=s_state_null_count@2 != row_count@3 AND s_state_min@0 <= TN AND TN <= s_state_max@1, required_guarantees=[s_state in (TN)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_state], file_type=parquet, predicate=s_state@24 = TN, pruning_predicate=s_state_null_count@2 != row_count@3 AND s_state_min@0 <= TN AND TN <= s_state_max@1, required_guarantees=[s_state in (TN)] └────────────────────────────────────────────────── - ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(d1.d_date_sk AS Float64)@1, ss_sold_date_sk@0)], projection=[ss_item_sk@3, ss_store_sk@4, ss_ext_sales_price@5, ss_net_profit@6] │ CoalescePartitionsExec │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_store_sk, ss_ext_sales_price, ss_net_profit], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_store_sk, ss_ext_sales_price, ss_net_profit], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(d1.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2001, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2001, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2001, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (2001)] └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p2] t1:[p0..p2] │ RepartitionExec: partitioning=Hash([i_category@0], 3), input_partitions=3 @@ -4249,34 +3921,30 @@ mod tests { │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ss_item_sk@0, i_item_sk@0)], projection=[ss_store_sk@1, ss_ext_sales_price@2, ss_net_profit@3, i_class@5, i_category@6] │ CoalescePartitionsExec - │ [Stage 7] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 7] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_class, i_category], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_class, i_category], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[s_store_sk@0 as s_store_sk, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)] │ FilterExec: s_state@1 = TN, projection=[s_store_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_state], file_type=parquet, predicate=s_state@24 = TN, pruning_predicate=s_state_null_count@2 != row_count@3 AND s_state_min@0 <= TN AND TN <= s_state_max@1, required_guarantees=[s_state in (TN)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_state], file_type=parquet, predicate=s_state@24 = TN, pruning_predicate=s_state_null_count@2 != row_count@3 AND s_state_min@0 <= TN AND TN <= s_state_max@1, required_guarantees=[s_state in (TN)] └────────────────────────────────────────────────── - ┌───── Stage 7 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 7 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(d1.d_date_sk AS Float64)@1, ss_sold_date_sk@0)], projection=[ss_item_sk@3, ss_store_sk@4, ss_ext_sales_price@5, ss_net_profit@6] │ CoalescePartitionsExec │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_store_sk, ss_ext_sales_price, ss_net_profit], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_store_sk, ss_ext_sales_price, ss_net_profit], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(d1.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2001, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2001, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2001, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (2001)] └────────────────────────────────────────────────── ┌───── Stage 15 ── Tasks: t0:[p0..p2] t1:[p0..p2] │ AggregateExec: mode=Partial, gby=[], aggr=[sum(results.ss_net_profit), sum(results.ss_ext_sales_price)] @@ -4292,34 +3960,30 @@ mod tests { │ [Stage 11] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ss_item_sk@0, i_item_sk@0)], projection=[ss_store_sk@1, ss_ext_sales_price@2, ss_net_profit@3, i_class@5, i_category@6] │ CoalescePartitionsExec - │ [Stage 13] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 13] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_class, i_category], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_class, i_category], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 11 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[s_store_sk@0 as s_store_sk, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)] │ FilterExec: s_state@1 = TN, projection=[s_store_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_state], file_type=parquet, predicate=s_state@24 = TN, pruning_predicate=s_state_null_count@2 != row_count@3 AND s_state_min@0 <= TN AND TN <= s_state_max@1, required_guarantees=[s_state in (TN)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_state], file_type=parquet, predicate=s_state@24 = TN, pruning_predicate=s_state_null_count@2 != row_count@3 AND s_state_min@0 <= TN AND TN <= s_state_max@1, required_guarantees=[s_state in (TN)] └────────────────────────────────────────────────── - ┌───── Stage 13 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 13 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(d1.d_date_sk AS Float64)@1, ss_sold_date_sk@0)], projection=[ss_item_sk@3, ss_store_sk@4, ss_ext_sales_price@5, ss_net_profit@6] │ CoalescePartitionsExec │ [Stage 12] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_store_sk, ss_ext_sales_price, ss_net_profit], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_store_sk, ss_ext_sales_price, ss_net_profit], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 12 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(d1.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2001, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2001, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2001, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (2001)] └────────────────────────────────────────────────── "#); Ok(()) @@ -4327,7 +3991,7 @@ mod tests { #[tokio::test] async fn test_tpcds_37() -> Result<()> { let display = test_tpcds_query("q37").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [i_item_id@0 ASC NULLS LAST], fetch=100 │ [Stage 5] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -4338,39 +4002,35 @@ mod tests { │ [Stage 4] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([i_item_id@0, i_item_desc@1, i_current_price@2], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([i_item_id@0, i_item_desc@1, i_current_price@2], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[i_item_id@0 as i_item_id, i_item_desc@1 as i_item_desc, i_current_price@2 as i_current_price], aggr=[] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, cs_item_sk@0)], projection=[i_item_id@1, i_item_desc@2, i_current_price@3] │ CoalescePartitionsExec │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_item_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_item_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(inv_date_sk@4, d_date_sk@0)], projection=[i_item_sk@0, i_item_id@1, i_item_desc@2, i_current_price@3] │ CoalescePartitionsExec - │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ FilterExec: d_date@1 >= 2000-02-01 AND d_date@1 <= 2000-04-01, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-02-01 AND d_date@2 <= 2000-04-01 AND DynamicFilter [ empty ], pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-02-01 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-04-01, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-02-01 AND d_date@2 <= 2000-04-01 AND DynamicFilter [ empty ], pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-02-01 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-04-01, required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, inv_item_sk@1)], projection=[i_item_sk@0, i_item_id@1, i_item_desc@2, i_current_price@3, inv_date_sk@4] │ CoalescePartitionsExec │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 │ FilterExec: inv_quantity_on_hand@2 >= 100 AND inv_quantity_on_hand@2 <= 500, projection=[inv_date_sk@0, inv_item_sk@1] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/inventory/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-2.parquet:..], ...]}, projection=[inv_date_sk, inv_item_sk, inv_quantity_on_hand], file_type=parquet, predicate=inv_quantity_on_hand@3 >= 100 AND inv_quantity_on_hand@3 <= 500 AND DynamicFilter [ empty ], pruning_predicate=inv_quantity_on_hand_null_count@1 != row_count@2 AND inv_quantity_on_hand_max@0 >= 100 AND inv_quantity_on_hand_null_count@1 != row_count@2 AND inv_quantity_on_hand_min@3 <= 500, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/inventory/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/inventory/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/inventory/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/inventory/part-3.parquet:..]]}, projection=[inv_date_sk, inv_item_sk, inv_quantity_on_hand], file_type=parquet, predicate=inv_quantity_on_hand@3 >= 100 AND inv_quantity_on_hand@3 <= 500 AND DynamicFilter [ empty ], pruning_predicate=inv_quantity_on_hand_null_count@1 != row_count@2 AND inv_quantity_on_hand_max@0 >= 100 AND inv_quantity_on_hand_null_count@1 != row_count@2 AND inv_quantity_on_hand_min@3 <= 500, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ FilterExec: i_current_price@3 >= Some(6800),4,2 AND i_current_price@3 <= Some(9800),4,2 AND i_manufact_id@4 IN (SET) ([677, 940, 694, 808]), projection=[i_item_sk@0, i_item_id@1, i_item_desc@2, i_current_price@3] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id, i_item_desc, i_current_price, i_manufact_id], file_type=parquet, predicate=i_current_price@5 >= Some(6800),4,2 AND i_current_price@5 <= Some(9800),4,2 AND i_manufact_id@13 IN (SET) ([677, 940, 694, 808]) AND DynamicFilter [ empty ], pruning_predicate=i_current_price_null_count@1 != row_count@2 AND i_current_price_max@0 >= Some(6800),4,2 AND i_current_price_null_count@1 != row_count@2 AND i_current_price_min@3 <= Some(9800),4,2 AND (i_manufact_id_null_count@6 != row_count@2 AND i_manufact_id_min@4 <= 677 AND 677 <= i_manufact_id_max@5 OR i_manufact_id_null_count@6 != row_count@2 AND i_manufact_id_min@4 <= 940 AND 940 <= i_manufact_id_max@5 OR i_manufact_id_null_count@6 != row_count@2 AND i_manufact_id_min@4 <= 694 AND 694 <= i_manufact_id_max@5 OR i_manufact_id_null_count@6 != row_count@2 AND i_manufact_id_min@4 <= 808 AND 808 <= i_manufact_id_max@5), required_guarantees=[i_manufact_id in (677, 694, 808, 940)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id, i_item_desc, i_current_price, i_manufact_id], file_type=parquet, predicate=i_current_price@5 >= Some(6800),4,2 AND i_current_price@5 <= Some(9800),4,2 AND i_manufact_id@13 IN (SET) ([677, 940, 694, 808]) AND DynamicFilter [ empty ], pruning_predicate=i_current_price_null_count@1 != row_count@2 AND i_current_price_max@0 >= Some(6800),4,2 AND i_current_price_null_count@1 != row_count@2 AND i_current_price_min@3 <= Some(9800),4,2 AND (i_manufact_id_null_count@6 != row_count@2 AND i_manufact_id_min@4 <= 677 AND 677 <= i_manufact_id_max@5 OR i_manufact_id_null_count@6 != row_count@2 AND i_manufact_id_min@4 <= 940 AND 940 <= i_manufact_id_max@5 OR i_manufact_id_null_count@6 != row_count@2 AND i_manufact_id_min@4 <= 694 AND 694 <= i_manufact_id_max@5 OR i_manufact_id_null_count@6 != row_count@2 AND i_manufact_id_min@4 <= 808 AND 808 <= i_manufact_id_max@5), required_guarantees=[i_manufact_id in (677, 694, 808, 940)] └────────────────────────────────────────────────── "); Ok(()) @@ -4378,7 +4038,7 @@ mod tests { #[tokio::test] async fn test_tpcds_38() -> Result<()> { let display = test_tpcds_query("q38").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ ProjectionExec: expr=[count(Int64(1))@0 as count(*)] │ GlobalLimitExec: skip=0, fetch=100 @@ -4410,28 +4070,25 @@ mod tests { │ ProjectionExec: expr=[c_last_name@2 as c_last_name, c_first_name@1 as c_first_name, d_date@0 as d_date] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ws_bill_customer_sk@0, CAST(customer.c_customer_sk AS Float64)@3)], projection=[d_date@1, c_first_name@3, c_last_name@4] │ CoalescePartitionsExec - │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_first_name@1 as c_first_name, c_last_name@2 as c_last_name, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_first_name, c_last_name], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_first_name, c_last_name], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[ws_bill_customer_sk@1 as ws_bill_customer_sk, d_date@0 as d_date] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@2, ws_sold_date_sk@0)], projection=[d_date@1, ws_bill_customer_sk@4] │ CoalescePartitionsExec │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_bill_customer_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_bill_customer_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_date@1 as d_date, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_month_seq@2 >= 1200 AND d_month_seq@2 <= 1211, projection=[d_date_sk@0, d_date@1] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date, d_month_seq], file_type=parquet, predicate=d_month_seq@3 >= 1200 AND d_month_seq@3 <= 1211, pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1200 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1211, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date, d_month_seq], file_type=parquet, predicate=d_month_seq@3 >= 1200 AND d_month_seq@3 <= 1211, pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1200 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1211, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=3, consumer_tasks=2, output_partitions=6 @@ -4444,28 +4101,25 @@ mod tests { │ ProjectionExec: expr=[c_last_name@2 as c_last_name, c_first_name@1 as c_first_name, d_date@0 as d_date] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(cs_bill_customer_sk@0, CAST(customer.c_customer_sk AS Float64)@3)], projection=[d_date@1, c_first_name@3, c_last_name@4] │ CoalescePartitionsExec - │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_first_name@1 as c_first_name, c_last_name@2 as c_last_name, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_first_name, c_last_name], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_first_name, c_last_name], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 6 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 6 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[cs_bill_customer_sk@1 as cs_bill_customer_sk, d_date@0 as d_date] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@2, cs_sold_date_sk@0)], projection=[d_date@1, cs_bill_customer_sk@4] │ CoalescePartitionsExec │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_bill_customer_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_bill_customer_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_date@1 as d_date, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_month_seq@2 >= 1200 AND d_month_seq@2 <= 1211, projection=[d_date_sk@0, d_date@1] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date, d_month_seq], file_type=parquet, predicate=d_month_seq@3 >= 1200 AND d_month_seq@3 <= 1211, pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1200 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1211, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date, d_month_seq], file_type=parquet, predicate=d_month_seq@3 >= 1200 AND d_month_seq@3 <= 1211, pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1200 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1211, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 11 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] │ RepartitionExec: partitioning=Hash([c_last_name@0, c_first_name@1, d_date@2], 6), input_partitions=3 @@ -4473,28 +4127,25 @@ mod tests { │ ProjectionExec: expr=[c_last_name@2 as c_last_name, c_first_name@1 as c_first_name, d_date@0 as d_date] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ss_customer_sk@0, CAST(customer.c_customer_sk AS Float64)@3)], projection=[d_date@1, c_first_name@3, c_last_name@4] │ CoalescePartitionsExec - │ [Stage 10] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 10] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_first_name@1 as c_first_name, c_last_name@2 as c_last_name, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_first_name, c_last_name], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_first_name, c_last_name], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 10 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 10 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[ss_customer_sk@1 as ss_customer_sk, d_date@0 as d_date] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@2, ss_sold_date_sk@0)], projection=[d_date@1, ss_customer_sk@4] │ CoalescePartitionsExec │ [Stage 9] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_customer_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_customer_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_date@1 as d_date, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_month_seq@2 >= 1200 AND d_month_seq@2 <= 1211, projection=[d_date_sk@0, d_date@1] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date, d_month_seq], file_type=parquet, predicate=d_month_seq@3 >= 1200 AND d_month_seq@3 <= 1211, pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1200 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1211, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date, d_month_seq], file_type=parquet, predicate=d_month_seq@3 >= 1200 AND d_month_seq@3 <= 1211, pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1200 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1211, required_guarantees=[] └────────────────────────────────────────────────── "); Ok(()) @@ -4502,7 +4153,7 @@ mod tests { #[tokio::test] async fn test_tpcds_39() -> Result<()> { let display = test_tpcds_query("q39").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ ProjectionExec: expr=[wsk1@0 as wsk1, isk1@1 as isk1, dmoy1@2 as dmoy1, mean1@3 as mean1, cov1@4 as cov1, w_warehouse_sk@5 as w_warehouse_sk, i_item_sk@6 as i_item_sk, d_moy@7 as d_moy, mean@8 as mean, cov@9 as cov] │ SortPreservingMergeExec: [w_warehouse_sk@10 ASC, i_item_sk@11 ASC, d_moy@12 ASC, mean@13 ASC, cov@14 ASC, d_moy@7 ASC, mean@8 ASC, cov@9 ASC] @@ -4521,7 +4172,7 @@ mod tests { │ [Stage 4] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([w_warehouse_name@0, w_warehouse_sk@1, i_item_sk@2, d_moy@3], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([w_warehouse_name@0, w_warehouse_sk@1, i_item_sk@2, d_moy@3], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[w_warehouse_name@3 as w_warehouse_name, w_warehouse_sk@2 as w_warehouse_sk, i_item_sk@1 as i_item_sk, d_moy@4 as d_moy], aggr=[stddev(inventory.inv_quantity_on_hand), avg(inventory.inv_quantity_on_hand)], ordering_mode=PartiallySorted([3]) │ ProjectionExec: expr=[inv_quantity_on_hand@1 as inv_quantity_on_hand, i_item_sk@2 as i_item_sk, w_warehouse_sk@3 as w_warehouse_sk, w_warehouse_name@4 as w_warehouse_name, d_moy@0 as d_moy] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(d_date_sk@0, inv_date_sk@0)], projection=[d_moy@1, inv_quantity_on_hand@3, i_item_sk@4, w_warehouse_sk@5, w_warehouse_name@6] @@ -4535,25 +4186,21 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, inv_item_sk@1)], projection=[i_item_sk@0, inv_date_sk@1, inv_warehouse_sk@3, inv_quantity_on_hand@4] │ CoalescePartitionsExec │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/inventory/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-2.parquet:..], ...]}, projection=[inv_date_sk, inv_item_sk, inv_warehouse_sk, inv_quantity_on_hand], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/inventory/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/inventory/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/inventory/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/inventory/part-3.parquet:..]]}, projection=[inv_date_sk, inv_item_sk, inv_warehouse_sk, inv_quantity_on_hand], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ FilterExec: d_moy@2 = 1 AND d_year@1 = 2001, projection=[d_date_sk@0, d_moy@2] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_moy@8 = 1 AND d_year@6 = 2001, pruning_predicate=d_moy_null_count@2 != row_count@3 AND d_moy_min@0 <= 1 AND 1 <= d_moy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 2001 AND 2001 <= d_year_max@5, required_guarantees=[d_moy in (1), d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_moy@8 = 1 AND d_year@6 = 2001, pruning_predicate=d_moy_null_count@2 != row_count@3 AND d_moy_min@0 <= 1 AND 1 <= d_moy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 2001 AND 2001 <= d_year_max@5, required_guarantees=[d_moy in (1), d_year in (2001)] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/warehouse/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-3.parquet]]}, projection=[w_warehouse_sk, w_warehouse_name], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/warehouse/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/warehouse/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/warehouse/part-3.parquet]]}, projection=[w_warehouse_sk, w_warehouse_name], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 10 ── Tasks: t0:[p0..p2] t1:[p0..p2] │ RepartitionExec: partitioning=Hash([i_item_sk@1, w_warehouse_sk@0], 3), input_partitions=3 @@ -4564,7 +4211,7 @@ mod tests { │ [Stage 9] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([w_warehouse_name@0, w_warehouse_sk@1, i_item_sk@2, d_moy@3], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([w_warehouse_name@0, w_warehouse_sk@1, i_item_sk@2, d_moy@3], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[w_warehouse_name@3 as w_warehouse_name, w_warehouse_sk@2 as w_warehouse_sk, i_item_sk@1 as i_item_sk, d_moy@4 as d_moy], aggr=[stddev(inventory.inv_quantity_on_hand), avg(inventory.inv_quantity_on_hand)], ordering_mode=PartiallySorted([3]) │ ProjectionExec: expr=[inv_quantity_on_hand@1 as inv_quantity_on_hand, i_item_sk@2 as i_item_sk, w_warehouse_sk@3 as w_warehouse_sk, w_warehouse_name@4 as w_warehouse_name, d_moy@0 as d_moy] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(d_date_sk@0, inv_date_sk@0)], projection=[d_moy@1, inv_quantity_on_hand@3, i_item_sk@4, w_warehouse_sk@5, w_warehouse_name@6] @@ -4578,25 +4225,21 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, inv_item_sk@1)], projection=[i_item_sk@0, inv_date_sk@1, inv_warehouse_sk@3, inv_quantity_on_hand@4] │ CoalescePartitionsExec │ [Stage 8] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/inventory/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-2.parquet:..], ...]}, projection=[inv_date_sk, inv_item_sk, inv_warehouse_sk, inv_quantity_on_hand], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/inventory/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/inventory/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/inventory/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/inventory/part-3.parquet:..]]}, projection=[inv_date_sk, inv_item_sk, inv_warehouse_sk, inv_quantity_on_hand], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ FilterExec: d_moy@2 = 2 AND d_year@1 = 2001, projection=[d_date_sk@0, d_moy@2] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_moy@8 = 2 AND d_year@6 = 2001, pruning_predicate=d_moy_null_count@2 != row_count@3 AND d_moy_min@0 <= 2 AND 2 <= d_moy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 2001 AND 2001 <= d_year_max@5, required_guarantees=[d_moy in (2), d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_moy@8 = 2 AND d_year@6 = 2001, pruning_predicate=d_moy_null_count@2 != row_count@3 AND d_moy_min@0 <= 2 AND 2 <= d_moy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 2001 AND 2001 <= d_year_max@5, required_guarantees=[d_moy in (2), d_year in (2001)] └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/warehouse/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-3.parquet]]}, projection=[w_warehouse_sk, w_warehouse_name], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/warehouse/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/warehouse/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/warehouse/part-3.parquet]]}, projection=[w_warehouse_sk, w_warehouse_name], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk], file_type=parquet └────────────────────────────────────────────────── "); Ok(()) @@ -4625,8 +4268,7 @@ mod tests { │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_date@1 as d_date, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_date@1 >= 2000-02-10 AND d_date@1 <= 2000-04-10 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-02-10 AND d_date@2 <= 2000-04-10 AND DynamicFilter [ empty ], pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-02-10 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-04-10, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-02-10 AND d_date@2 <= 2000-04-10 AND DynamicFilter [ empty ], pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-02-10 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-04-10, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -4635,8 +4277,7 @@ mod tests { │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ FilterExec: i_current_price@2 >= Some(99),4,2 AND i_current_price@2 <= Some(149),4,2, projection=[i_item_sk@0, i_item_id@1] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id, i_current_price], file_type=parquet, predicate=i_current_price@5 >= Some(99),4,2 AND i_current_price@5 <= Some(149),4,2 AND DynamicFilter [ empty ], pruning_predicate=i_current_price_null_count@1 != row_count@2 AND i_current_price_max@0 >= Some(99),4,2 AND i_current_price_null_count@1 != row_count@2 AND i_current_price_min@3 <= Some(149),4,2, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id, i_current_price], file_type=parquet, predicate=i_current_price@5 >= Some(99),4,2 AND i_current_price@5 <= Some(149),4,2 AND DynamicFilter [ empty ], pruning_predicate=i_current_price_null_count@1 != row_count@2 AND i_current_price_max@0 >= Some(99),4,2 AND i_current_price_null_count@1 != row_count@2 AND i_current_price_min@3 <= Some(149),4,2, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -4651,18 +4292,15 @@ mod tests { └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/warehouse/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-3.parquet]]}, projection=[w_warehouse_sk, w_state, CAST(w_warehouse_sk@0 AS Float64) as CAST(warehouse.w_warehouse_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/warehouse/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/warehouse/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/warehouse/part-3.parquet]]}, projection=[w_warehouse_sk, w_state, CAST(w_warehouse_sk@0 AS Float64) as CAST(warehouse.w_warehouse_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([cr_order_number@1, cr_item_sk@0], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], ...]}, projection=[cr_item_sk, cr_order_number, cr_refunded_cash], file_type=parquet + │ RepartitionExec: partitioning=Hash([cr_order_number@1, cr_item_sk@0], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-3.parquet:..]]}, projection=[cr_item_sk, cr_order_number, cr_refunded_cash], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([cs_order_number@3, cs_item_sk@2], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_warehouse_sk, cs_item_sk, cs_order_number, cs_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([cs_order_number@3, cs_item_sk@2], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_warehouse_sk, cs_item_sk, cs_order_number, cs_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── "#); Ok(()) @@ -4670,7 +4308,7 @@ mod tests { #[tokio::test] async fn test_tpcds_41() -> Result<()> { let display = test_tpcds_query("q41").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [i_product_name@0 ASC NULLS LAST], fetch=100 │ SortExec: TopK(fetch=100), expr=[i_product_name@0 ASC NULLS LAST], preserve_partitioning=[true] @@ -4687,8 +4325,7 @@ mod tests { │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=6, input_tasks=1 │ FilterExec: i_manufact_id@0 >= 738 AND i_manufact_id@0 <= 778, projection=[i_manufact@1, i_product_name@2] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_manufact_id, i_manufact, i_product_name], file_type=parquet, predicate=i_manufact_id@13 >= 738 AND i_manufact_id@13 <= 778 AND DynamicFilter [ empty ], pruning_predicate=i_manufact_id_null_count@1 != row_count@2 AND i_manufact_id_max@0 >= 738 AND i_manufact_id_null_count@1 != row_count@2 AND i_manufact_id_min@3 <= 778, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_manufact_id, i_manufact, i_product_name], file_type=parquet, predicate=i_manufact_id@13 >= 738 AND i_manufact_id@13 <= 778 AND DynamicFilter [ empty ], pruning_predicate=i_manufact_id_null_count@1 != row_count@2 AND i_manufact_id_max@0 >= 738 AND i_manufact_id_null_count@1 != row_count@2 AND i_manufact_id_min@3 <= 778, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p5] │ BroadcastExec: input_partitions=3, consumer_tasks=2, output_partitions=6 @@ -4702,8 +4339,7 @@ mod tests { │ FilterExec: __common_expr_4@0 AND ((i_color@4 = powder OR i_color@4 = khaki) AND (i_units@5 = Ounce OR i_units@5 = Oz) AND (i_size@3 = medium OR i_size@3 = extra large) OR (i_color@4 = brown OR i_color@4 = honeydew) AND (i_units@5 = Bunch OR i_units@5 = Ton) AND (i_size@3 = N/A OR i_size@3 = small)) OR i_category@1 = Men AND (i_color@4 = floral OR i_color@4 = deep) AND (i_units@5 = N/A OR i_units@5 = Dozen) AND i_size@3 = petite OR i_category@1 = Men AND (i_color@4 = light OR i_color@4 = cornflower) AND (i_units@5 = Box OR i_units@5 = Pound) AND (i_size@3 = medium OR i_size@3 = extra large) OR __common_expr_4@0 AND ((i_color@4 = midnight OR i_color@4 = snow) AND (i_units@5 = Pallet OR i_units@5 = Gross) AND (i_size@3 = medium OR i_size@3 = extra large) OR (i_color@4 = cyan OR i_color@4 = papaya) AND (i_units@5 = Cup OR i_units@5 = Dram) AND (i_size@3 = N/A OR i_size@3 = small)) OR i_category@1 = Men AND (i_color@4 = orange OR i_color@4 = frosted) AND (i_units@5 = Each OR i_units@5 = Tbl) AND i_size@3 = petite OR i_category@1 = Men AND (i_color@4 = forest OR i_color@4 = ghost) AND (i_units@5 = Lb OR i_units@5 = Bundle) AND (i_size@3 = medium OR i_size@3 = extra large), projection=[i_manufact@2] │ ProjectionExec: expr=[i_category@0 = Women as __common_expr_4, i_category@0 as i_category, i_manufact@1 as i_manufact, i_size@2 as i_size, i_color@3 as i_color, i_units@4 as i_units] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_category, i_manufact, i_size, i_color, i_units], file_type=parquet, predicate=i_category@12 = Women AND ((i_color@17 = powder OR i_color@17 = khaki) AND (i_units@18 = Ounce OR i_units@18 = Oz) AND (i_size@15 = medium OR i_size@15 = extra large) OR (i_color@17 = brown OR i_color@17 = honeydew) AND (i_units@18 = Bunch OR i_units@18 = Ton) AND (i_size@15 = N/A OR i_size@15 = small)) OR i_category@12 = Men AND (i_color@17 = floral OR i_color@17 = deep) AND (i_units@18 = N/A OR i_units@18 = Dozen) AND i_size@15 = petite OR i_category@12 = Men AND (i_color@17 = light OR i_color@17 = cornflower) AND (i_units@18 = Box OR i_units@18 = Pound) AND (i_size@15 = medium OR i_size@15 = extra large) OR i_category@12 = Women AND ((i_color@17 = midnight OR i_color@17 = snow) AND (i_units@18 = Pallet OR i_units@18 = Gross) AND (i_size@15 = medium OR i_size@15 = extra large) OR (i_color@17 = cyan OR i_color@17 = papaya) AND (i_units@18 = Cup OR i_units@18 = Dram) AND (i_size@15 = N/A OR i_size@15 = small)) OR i_category@12 = Men AND (i_color@17 = orange OR i_color@17 = frosted) AND (i_units@18 = Each OR i_units@18 = Tbl) AND i_size@15 = petite OR i_category@12 = Men AND (i_color@17 = forest OR i_color@17 = ghost) AND (i_units@18 = Lb OR i_units@18 = Bundle) AND (i_size@15 = medium OR i_size@15 = extra large), pruning_predicate=i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Women AND Women <= i_category_max@1 AND ((i_color_null_count@6 != row_count@3 AND i_color_min@4 <= powder AND powder <= i_color_max@5 OR i_color_null_count@6 != row_count@3 AND i_color_min@4 <= khaki AND khaki <= i_color_max@5) AND (i_units_null_count@9 != row_count@3 AND i_units_min@7 <= Ounce AND Ounce <= i_units_max@8 OR i_units_null_count@9 != row_count@3 AND i_units_min@7 <= Oz AND Oz <= i_units_max@8) AND (i_size_null_count@12 != row_count@3 AND i_size_min@10 <= medium AND medium <= i_size_max@11 OR i_size_null_count@12 != row_count@3 AND i_size_min@10 <= extra large AND extra large <= i_size_max@11) OR (i_color_null_count@6 != row_count@3 AND i_color_min@4 <= brown AND brown <= i_color_max@5 OR i_color_null_count@6 != row_count@3 AND i_color_min@4 <= honeydew AND honeydew <= i_color_max@5) AND (i_units_null_count@9 != row_count@3 AND i_units_min@7 <= Bunch AND Bunch <= i_units_max@8 OR i_units_null_count@9 != row_count@3 AND i_units_min@7 <= Ton AND Ton <= i_units_max@8) AND (i_size_null_count@12 != row_count@3 AND i_size_min@10 <= N/A AND N/A <= i_size_max@11 OR i_size_null_count@12 != row_count@3 AND i_size_min@10 <= small AND small <= i_size_max@11)) OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Men AND Men <= i_category_max@1 AND (i_color_null_count@6 != row_count@3 AND i_color_min@4 <= floral AND floral <= i_color_max@5 OR i_color_null_count@6 != row_count@3 AND i_color_min@4 <= deep AND deep <= i_color_max@5) AND (i_units_null_count@9 != row_count@3 AND i_units_min@7 <= N/A AND N/A <= i_units_max@8 OR i_units_null_count@9 != row_count@3 AND i_units_min@7 <= Dozen AND Dozen <= i_units_max@8) AND i_size_null_count@12 != row_count@3 AND i_size_min@10 <= petite AND petite <= i_size_max@11 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Men AND Men <= i_category_max@1 AND (i_color_null_count@6 != row_count@3 AND i_color_min@4 <= light AND light <= i_color_max@5 OR i_color_null_count@6 != row_count@3 AND i_color_min@4 <= cornflower AND cornflower <= i_color_max@5) AND (i_units_null_count@9 != row_count@3 AND i_units_min@7 <= Box AND Box <= i_units_max@8 OR i_units_null_count@9 != row_count@3 AND i_units_min@7 <= Pound AND Pound <= i_units_max@8) AND (i_size_null_count@12 != row_count@3 AND i_size_min@10 <= medium AND medium <= i_size_max@11 OR i_size_null_count@12 != row_count@3 AND i_size_min@10 <= extra large AND extra large <= i_size_max@11) OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Women AND Women <= i_category_max@1 AND ((i_color_null_count@6 != row_count@3 AND i_color_min@4 <= midnight AND midnight <= i_color_max@5 OR i_color_null_count@6 != row_count@3 AND i_color_min@4 <= snow AND snow <= i_color_max@5) AND (i_units_null_count@9 != row_count@3 AND i_units_min@7 <= Pallet AND Pallet <= i_units_max@8 OR i_units_null_count@9 != row_count@3 AND i_units_min@7 <= Gross AND Gross <= i_units_max@8) AND (i_size_null_count@12 != row_count@3 AND i_size_min@10 <= medium AND medium <= i_size_max@11 OR i_size_null_count@12 != row_count@3 AND i_size_min@10 <= extra large AND extra large <= i_size_max@11) OR (i_color_null_count@6 != row_count@3 AND i_color_min@4 <= cyan AND cyan <= i_color_max@5 OR i_color_null_count@6 != row_count@3 AND i_color_min@4 <= papaya AND papaya <= i_color_max@5) AND (i_units_null_count@9 != row_count@3 AND i_units_min@7 <= Cup AND Cup <= i_units_max@8 OR i_units_null_count@9 != row_count@3 AND i_units_min@7 <= Dram AND Dram <= i_units_max@8) AND (i_size_null_count@12 != row_count@3 AND i_size_min@10 <= N/A AND N/A <= i_size_max@11 OR i_size_null_count@12 != row_count@3 AND i_size_min@10 <= small AND small <= i_size_max@11)) OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Men AND Men <= i_category_max@1 AND (i_color_null_count@6 != row_count@3 AND i_color_min@4 <= orange AND orange <= i_color_max@5 OR i_color_null_count@6 != row_count@3 AND i_color_min@4 <= frosted AND frosted <= i_color_max@5) AND (i_units_null_count@9 != row_count@3 AND i_units_min@7 <= Each AND Each <= i_units_max@8 OR i_units_null_count@9 != row_count@3 AND i_units_min@7 <= Tbl AND Tbl <= i_units_max@8) AND i_size_null_count@12 != row_count@3 AND i_size_min@10 <= petite AND petite <= i_size_max@11 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Men AND Men <= i_category_max@1 AND (i_color_null_count@6 != row_count@3 AND i_color_min@4 <= forest AND forest <= i_color_max@5 OR i_color_null_count@6 != row_count@3 AND i_color_min@4 <= ghost AND ghost <= i_color_max@5) AND (i_units_null_count@9 != row_count@3 AND i_units_min@7 <= Lb AND Lb <= i_units_max@8 OR i_units_null_count@9 != row_count@3 AND i_units_min@7 <= Bundle AND Bundle <= i_units_max@8) AND (i_size_null_count@12 != row_count@3 AND i_size_min@10 <= medium AND medium <= i_size_max@11 OR i_size_null_count@12 != row_count@3 AND i_size_min@10 <= extra large AND extra large <= i_size_max@11), required_guarantees=[i_category in (Men, Women)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_category, i_manufact, i_size, i_color, i_units], file_type=parquet, predicate=i_category@12 = Women AND ((i_color@17 = powder OR i_color@17 = khaki) AND (i_units@18 = Ounce OR i_units@18 = Oz) AND (i_size@15 = medium OR i_size@15 = extra large) OR (i_color@17 = brown OR i_color@17 = honeydew) AND (i_units@18 = Bunch OR i_units@18 = Ton) AND (i_size@15 = N/A OR i_size@15 = small)) OR i_category@12 = Men AND (i_color@17 = floral OR i_color@17 = deep) AND (i_units@18 = N/A OR i_units@18 = Dozen) AND i_size@15 = petite OR i_category@12 = Men AND (i_color@17 = light OR i_color@17 = cornflower) AND (i_units@18 = Box OR i_units@18 = Pound) AND (i_size@15 = medium OR i_size@15 = extra large) OR i_category@12 = Women AND ((i_color@17 = midnight OR i_color@17 = snow) AND (i_units@18 = Pallet OR i_units@18 = Gross) AND (i_size@15 = medium OR i_size@15 = extra large) OR (i_color@17 = cyan OR i_color@17 = papaya) AND (i_units@18 = Cup OR i_units@18 = Dram) AND (i_size@15 = N/A OR i_size@15 = small)) OR i_category@12 = Men AND (i_color@17 = orange OR i_color@17 = frosted) AND (i_units@18 = Each OR i_units@18 = Tbl) AND i_size@15 = petite OR i_category@12 = Men AND (i_color@17 = forest OR i_color@17 = ghost) AND (i_units@18 = Lb OR i_units@18 = Bundle) AND (i_size@15 = medium OR i_size@15 = extra large), pruning_predicate=i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Women AND Women <= i_category_max@1 AND ((i_color_null_count@6 != row_count@3 AND i_color_min@4 <= powder AND powder <= i_color_max@5 OR i_color_null_count@6 != row_count@3 AND i_color_min@4 <= khaki AND khaki <= i_color_max@5) AND (i_units_null_count@9 != row_count@3 AND i_units_min@7 <= Ounce AND Ounce <= i_units_max@8 OR i_units_null_count@9 != row_count@3 AND i_units_min@7 <= Oz AND Oz <= i_units_max@8) AND (i_size_null_count@12 != row_count@3 AND i_size_min@10 <= medium AND medium <= i_size_max@11 OR i_size_null_count@12 != row_count@3 AND i_size_min@10 <= extra large AND extra large <= i_size_max@11) OR (i_color_null_count@6 != row_count@3 AND i_color_min@4 <= brown AND brown <= i_color_max@5 OR i_color_null_count@6 != row_count@3 AND i_color_min@4 <= honeydew AND honeydew <= i_color_max@5) AND (i_units_null_count@9 != row_count@3 AND i_units_min@7 <= Bunch AND Bunch <= i_units_max@8 OR i_units_null_count@9 != row_count@3 AND i_units_min@7 <= Ton AND Ton <= i_units_max@8) AND (i_size_null_count@12 != row_count@3 AND i_size_min@10 <= N/A AND N/A <= i_size_max@11 OR i_size_null_count@12 != row_count@3 AND i_size_min@10 <= small AND small <= i_size_max@11)) OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Men AND Men <= i_category_max@1 AND (i_color_null_count@6 != row_count@3 AND i_color_min@4 <= floral AND floral <= i_color_max@5 OR i_color_null_count@6 != row_count@3 AND i_color_min@4 <= deep AND deep <= i_color_max@5) AND (i_units_null_count@9 != row_count@3 AND i_units_min@7 <= N/A AND N/A <= i_units_max@8 OR i_units_null_count@9 != row_count@3 AND i_units_min@7 <= Dozen AND Dozen <= i_units_max@8) AND i_size_null_count@12 != row_count@3 AND i_size_min@10 <= petite AND petite <= i_size_max@11 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Men AND Men <= i_category_max@1 AND (i_color_null_count@6 != row_count@3 AND i_color_min@4 <= light AND light <= i_color_max@5 OR i_color_null_count@6 != row_count@3 AND i_color_min@4 <= cornflower AND cornflower <= i_color_max@5) AND (i_units_null_count@9 != row_count@3 AND i_units_min@7 <= Box AND Box <= i_units_max@8 OR i_units_null_count@9 != row_count@3 AND i_units_min@7 <= Pound AND Pound <= i_units_max@8) AND (i_size_null_count@12 != row_count@3 AND i_size_min@10 <= medium AND medium <= i_size_max@11 OR i_size_null_count@12 != row_count@3 AND i_size_min@10 <= extra large AND extra large <= i_size_max@11) OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Women AND Women <= i_category_max@1 AND ((i_color_null_count@6 != row_count@3 AND i_color_min@4 <= midnight AND midnight <= i_color_max@5 OR i_color_null_count@6 != row_count@3 AND i_color_min@4 <= snow AND snow <= i_color_max@5) AND (i_units_null_count@9 != row_count@3 AND i_units_min@7 <= Pallet AND Pallet <= i_units_max@8 OR i_units_null_count@9 != row_count@3 AND i_units_min@7 <= Gross AND Gross <= i_units_max@8) AND (i_size_null_count@12 != row_count@3 AND i_size_min@10 <= medium AND medium <= i_size_max@11 OR i_size_null_count@12 != row_count@3 AND i_size_min@10 <= extra large AND extra large <= i_size_max@11) OR (i_color_null_count@6 != row_count@3 AND i_color_min@4 <= cyan AND cyan <= i_color_max@5 OR i_color_null_count@6 != row_count@3 AND i_color_min@4 <= papaya AND papaya <= i_color_max@5) AND (i_units_null_count@9 != row_count@3 AND i_units_min@7 <= Cup AND Cup <= i_units_max@8 OR i_units_null_count@9 != row_count@3 AND i_units_min@7 <= Dram AND Dram <= i_units_max@8) AND (i_size_null_count@12 != row_count@3 AND i_size_min@10 <= N/A AND N/A <= i_size_max@11 OR i_size_null_count@12 != row_count@3 AND i_size_min@10 <= small AND small <= i_size_max@11)) OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Men AND Men <= i_category_max@1 AND (i_color_null_count@6 != row_count@3 AND i_color_min@4 <= orange AND orange <= i_color_max@5 OR i_color_null_count@6 != row_count@3 AND i_color_min@4 <= frosted AND frosted <= i_color_max@5) AND (i_units_null_count@9 != row_count@3 AND i_units_min@7 <= Each AND Each <= i_units_max@8 OR i_units_null_count@9 != row_count@3 AND i_units_min@7 <= Tbl AND Tbl <= i_units_max@8) AND i_size_null_count@12 != row_count@3 AND i_size_min@10 <= petite AND petite <= i_size_max@11 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Men AND Men <= i_category_max@1 AND (i_color_null_count@6 != row_count@3 AND i_color_min@4 <= forest AND forest <= i_color_max@5 OR i_color_null_count@6 != row_count@3 AND i_color_min@4 <= ghost AND ghost <= i_color_max@5) AND (i_units_null_count@9 != row_count@3 AND i_units_min@7 <= Lb AND Lb <= i_units_max@8 OR i_units_null_count@9 != row_count@3 AND i_units_min@7 <= Bundle AND Bundle <= i_units_max@8) AND (i_size_null_count@12 != row_count@3 AND i_size_min@10 <= medium AND medium <= i_size_max@11 OR i_size_null_count@12 != row_count@3 AND i_size_min@10 <= extra large AND extra large <= i_size_max@11), required_guarantees=[i_category in (Men, Women)] └────────────────────────────────────────────────── "); Ok(()) @@ -4711,7 +4347,7 @@ mod tests { #[tokio::test] async fn test_tpcds_42() -> Result<()> { let display = test_tpcds_query("q42").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [sum(store_sales.ss_ext_sales_price)@3 DESC, d_year@0 ASC NULLS LAST, i_category_id@1 ASC NULLS LAST, i_category@2 ASC NULLS LAST], fetch=100 │ [Stage 4] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -4726,27 +4362,24 @@ mod tests { │ AggregateExec: mode=Partial, gby=[d_year@0 as d_year, i_category_id@2 as i_category_id, i_category@3 as i_category], aggr=[sum(store_sales.ss_ext_sales_price)], ordering_mode=PartiallySorted([0]) │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ss_item_sk@1, i_item_sk@0)], projection=[d_year@0, ss_ext_sales_price@2, i_category_id@4, i_category@5] │ CoalescePartitionsExec - │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ FilterExec: i_manager_id@3 = 1, projection=[i_item_sk@0, i_category_id@1, i_category@2] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_category_id, i_category, i_manager_id], file_type=parquet, predicate=i_manager_id@20 = 1 AND DynamicFilter [ empty ], pruning_predicate=i_manager_id_null_count@2 != row_count@3 AND i_manager_id_min@0 <= 1 AND 1 <= i_manager_id_max@1, required_guarantees=[i_manager_id in (1)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_category_id, i_category, i_manager_id], file_type=parquet, predicate=i_manager_id@20 = 1 AND DynamicFilter [ empty ], pruning_predicate=i_manager_id_null_count@2 != row_count@3 AND i_manager_id_min@0 <= 1 AND 1 <= i_manager_id_max@1, required_guarantees=[i_manager_id in (1)] └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(dt.d_date_sk AS Float64)@2, ss_sold_date_sk@0)], projection=[d_year@1, ss_item_sk@4, ss_ext_sales_price@5] │ CoalescePartitionsExec │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, CAST(d_date_sk@0 AS Float64) as CAST(dt.d_date_sk AS Float64)] │ FilterExec: d_moy@2 = 11 AND d_year@1 = 2000, projection=[d_date_sk@0, d_year@1] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_moy@8 = 11 AND d_year@6 = 2000, pruning_predicate=d_moy_null_count@2 != row_count@3 AND d_moy_min@0 <= 11 AND 11 <= d_moy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 2000 AND 2000 <= d_year_max@5, required_guarantees=[d_moy in (11), d_year in (2000)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_moy@8 = 11 AND d_year@6 = 2000, pruning_predicate=d_moy_null_count@2 != row_count@3 AND d_moy_min@0 <= 11 AND 11 <= d_moy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 2000 AND 2000 <= d_year_max@5, required_guarantees=[d_moy in (11), d_year in (2000)] └────────────────────────────────────────────────── "); Ok(()) @@ -4766,7 +4399,7 @@ mod tests { │ [Stage 3] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([s_store_name@0, s_store_id@1], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([s_store_name@0, s_store_id@1], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[s_store_name@3 as s_store_name, s_store_id@2 as s_store_id], aggr=[sum(CASE WHEN date_dim.d_day_name = Utf8("Sunday") THEN store_sales.ss_sales_price ELSE NULL END), sum(CASE WHEN date_dim.d_day_name = Utf8("Monday") THEN store_sales.ss_sales_price ELSE NULL END), sum(CASE WHEN date_dim.d_day_name = Utf8("Tuesday") THEN store_sales.ss_sales_price ELSE NULL END), sum(CASE WHEN date_dim.d_day_name = Utf8("Wednesday") THEN store_sales.ss_sales_price ELSE NULL END), sum(CASE WHEN date_dim.d_day_name = Utf8("Thursday") THEN store_sales.ss_sales_price ELSE NULL END), sum(CASE WHEN date_dim.d_day_name = Utf8("Friday") THEN store_sales.ss_sales_price ELSE NULL END), sum(CASE WHEN date_dim.d_day_name = Utf8("Saturday") THEN store_sales.ss_sales_price ELSE NULL END)] │ ProjectionExec: expr=[d_day_name@2 as d_day_name, ss_sales_price@3 as ss_sales_price, s_store_id@0 as s_store_id, s_store_name@1 as s_store_name] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(store.s_store_sk AS Float64)@3, ss_store_sk@1)], projection=[s_store_id@1, s_store_name@2, d_day_name@4, ss_sales_price@6] @@ -4775,24 +4408,21 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@2, ss_sold_date_sk@0)], projection=[d_day_name@1, ss_store_sk@4, ss_sales_price@5] │ CoalescePartitionsExec │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_store_sk, ss_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_store_sk, ss_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[s_store_sk@0 as s_store_sk, s_store_id@1 as s_store_id, s_store_name@2 as s_store_name, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)] │ FilterExec: s_gmt_offset@3 = Some(-500),3,2, projection=[s_store_sk@0, s_store_id@1, s_store_name@2] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_id, s_store_name, s_gmt_offset], file_type=parquet, predicate=s_gmt_offset@27 = Some(-500),3,2, pruning_predicate=s_gmt_offset_null_count@2 != row_count@3 AND s_gmt_offset_min@0 <= Some(-500),3,2 AND Some(-500),3,2 <= s_gmt_offset_max@1, required_guarantees=[s_gmt_offset in (Some(-500),3,2)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_id, s_store_name, s_gmt_offset], file_type=parquet, predicate=s_gmt_offset@27 = Some(-500),3,2, pruning_predicate=s_gmt_offset_null_count@2 != row_count@3 AND s_gmt_offset_min@0 <= Some(-500),3,2 AND Some(-500),3,2 <= s_gmt_offset_max@1, required_guarantees=[s_gmt_offset in (Some(-500),3,2)] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_day_name@1 as d_day_name, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2000, projection=[d_date_sk@0, d_day_name@2] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_day_name], file_type=parquet, predicate=d_year@6 = 2000, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1, required_guarantees=[d_year in (2000)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_day_name], file_type=parquet, predicate=d_year@6 = 2000, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1, required_guarantees=[d_year in (2000)] └────────────────────────────────────────────────── "#); Ok(()) @@ -4821,13 +4451,11 @@ mod tests { └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p3] t1:[p4..p7] │ BroadcastExec: input_partitions=2, consumer_tasks=2, output_partitions=4 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_product_name], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_product_name], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p3] t1:[p4..p7] │ BroadcastExec: input_partitions=2, consumer_tasks=2, output_partitions=4 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_product_name], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_product_name], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p5] │ RepartitionExec: partitioning=Hash([rnk@1], 6), input_partitions=3 @@ -4849,18 +4477,16 @@ mod tests { │ [Stage 4] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p2] t1:[p0..p2] t2:[p0..p2] - │ RepartitionExec: partitioning=Hash([ss_item_sk@0], 3), input_partitions=2 + │ RepartitionExec: partitioning=Hash([ss_item_sk@0], 3), input_partitions=3 │ AggregateExec: mode=Partial, gby=[ss_item_sk@0 as ss_item_sk], aggr=[avg(ss1.ss_net_profit)] │ FilterExec: ss_store_sk@1 = 4, projection=[ss_item_sk@0, ss_net_profit@2] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_item_sk, ss_store_sk, ss_net_profit], file_type=parquet, predicate=ss_store_sk@7 = 4, pruning_predicate=ss_store_sk_null_count@2 != row_count@3 AND ss_store_sk_min@0 <= 4 AND 4 <= ss_store_sk_max@1, required_guarantees=[ss_store_sk in (4)] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_item_sk, ss_store_sk, ss_net_profit], file_type=parquet, predicate=ss_store_sk@7 = 4, pruning_predicate=ss_store_sk_null_count@2 != row_count@3 AND ss_store_sk_min@0 <= 4 AND 4 <= ss_store_sk_max@1, required_guarantees=[ss_store_sk in (4)] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p2] t1:[p0..p2] t2:[p0..p2] - │ RepartitionExec: partitioning=Hash([ss_store_sk@0], 3), input_partitions=2 + │ RepartitionExec: partitioning=Hash([ss_store_sk@0], 3), input_partitions=3 │ AggregateExec: mode=Partial, gby=[ss_store_sk@0 as ss_store_sk], aggr=[avg(store_sales.ss_net_profit)], ordering_mode=Sorted │ FilterExec: ss_store_sk@1 = 4 AND ss_addr_sk@0 IS NULL, projection=[ss_store_sk@1, ss_net_profit@2] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_addr_sk, ss_store_sk, ss_net_profit], file_type=parquet, predicate=ss_store_sk@7 = 4 AND ss_addr_sk@6 IS NULL, pruning_predicate=ss_store_sk_null_count@2 != row_count@3 AND ss_store_sk_min@0 <= 4 AND 4 <= ss_store_sk_max@1 AND ss_addr_sk_null_count@4 > 0, required_guarantees=[ss_store_sk in (4)] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_addr_sk, ss_store_sk, ss_net_profit], file_type=parquet, predicate=ss_store_sk@7 = 4 AND ss_addr_sk@6 IS NULL, pruning_predicate=ss_store_sk_null_count@2 != row_count@3 AND ss_store_sk_min@0 <= 4 AND 4 <= ss_store_sk_max@1 AND ss_addr_sk_null_count@4 > 0, required_guarantees=[ss_store_sk in (4)] └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p5] │ RepartitionExec: partitioning=Hash([rnk@1], 6), input_partitions=3 @@ -4882,18 +4508,16 @@ mod tests { │ [Stage 7] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p2] t1:[p0..p2] t2:[p0..p2] - │ RepartitionExec: partitioning=Hash([ss_item_sk@0], 3), input_partitions=2 + │ RepartitionExec: partitioning=Hash([ss_item_sk@0], 3), input_partitions=3 │ AggregateExec: mode=Partial, gby=[ss_item_sk@0 as ss_item_sk], aggr=[avg(ss1.ss_net_profit)] │ FilterExec: ss_store_sk@1 = 4, projection=[ss_item_sk@0, ss_net_profit@2] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_item_sk, ss_store_sk, ss_net_profit], file_type=parquet, predicate=ss_store_sk@7 = 4, pruning_predicate=ss_store_sk_null_count@2 != row_count@3 AND ss_store_sk_min@0 <= 4 AND 4 <= ss_store_sk_max@1, required_guarantees=[ss_store_sk in (4)] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_item_sk, ss_store_sk, ss_net_profit], file_type=parquet, predicate=ss_store_sk@7 = 4, pruning_predicate=ss_store_sk_null_count@2 != row_count@3 AND ss_store_sk_min@0 <= 4 AND 4 <= ss_store_sk_max@1, required_guarantees=[ss_store_sk in (4)] └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p2] t1:[p0..p2] t2:[p0..p2] - │ RepartitionExec: partitioning=Hash([ss_store_sk@0], 3), input_partitions=2 + │ RepartitionExec: partitioning=Hash([ss_store_sk@0], 3), input_partitions=3 │ AggregateExec: mode=Partial, gby=[ss_store_sk@0 as ss_store_sk], aggr=[avg(store_sales.ss_net_profit)], ordering_mode=Sorted │ FilterExec: ss_store_sk@1 = 4 AND ss_addr_sk@0 IS NULL, projection=[ss_store_sk@1, ss_net_profit@2] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_addr_sk, ss_store_sk, ss_net_profit], file_type=parquet, predicate=ss_store_sk@7 = 4 AND ss_addr_sk@6 IS NULL, pruning_predicate=ss_store_sk_null_count@2 != row_count@3 AND ss_store_sk_min@0 <= 4 AND 4 <= ss_store_sk_max@1 AND ss_addr_sk_null_count@4 > 0, required_guarantees=[ss_store_sk in (4)] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_addr_sk, ss_store_sk, ss_net_profit], file_type=parquet, predicate=ss_store_sk@7 = 4 AND ss_addr_sk@6 IS NULL, pruning_predicate=ss_store_sk_null_count@2 != row_count@3 AND ss_store_sk_min@0 <= 4 AND 4 <= ss_store_sk_max@1 AND ss_addr_sk_null_count@4 > 0, required_guarantees=[ss_store_sk in (4)] └────────────────────────────────────────────────── "#); Ok(()) @@ -4901,7 +4525,7 @@ mod tests { #[tokio::test] async fn test_tpcds_45() -> Result<()> { let display = test_tpcds_query("q45").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [ca_zip@0 ASC NULLS LAST, ca_city@1 ASC NULLS LAST], fetch=100 │ SortExec: TopK(fetch=100), expr=[ca_zip@0 ASC NULLS LAST, ca_city@1 ASC NULLS LAST], preserve_partitioning=[true] @@ -4921,8 +4545,7 @@ mod tests { │ CoalescePartitionsExec │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -4943,25 +4566,21 @@ mod tests { │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_qoy@2 = 2 AND d_year@1 = 2001, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_qoy], file_type=parquet, predicate=d_qoy@10 = 2 AND d_year@6 = 2001, pruning_predicate=d_qoy_null_count@2 != row_count@3 AND d_qoy_min@0 <= 2 AND 2 <= d_qoy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 2001 AND 2001 <= d_year_max@5, required_guarantees=[d_qoy in (2), d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_qoy], file_type=parquet, predicate=d_qoy@10 = 2 AND d_year@6 = 2001, pruning_predicate=d_qoy_null_count@2 != row_count@3 AND d_qoy_min@0 <= 2 AND 2 <= d_qoy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 2001 AND 2001 <= d_year_max@5, required_guarantees=[d_qoy in (2), d_year in (2001)] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_city, ca_zip], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_city, ca_zip], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p0..p8] │ RepartitionExec: partitioning=Hash([CAST(customer.c_customer_sk AS Float64)@2], 9), input_partitions=3 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_current_addr_sk@1 as c_current_addr_sk, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_addr_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_addr_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([ws_bill_customer_sk@2], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_item_sk, ws_bill_customer_sk, ws_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([ws_bill_customer_sk@2], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_item_sk, ws_bill_customer_sk, ws_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── "); Ok(()) @@ -4969,7 +4588,7 @@ mod tests { #[tokio::test] async fn test_tpcds_46() -> Result<()> { let display = test_tpcds_query("q46").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [c_last_name@0 ASC, c_first_name@1 ASC, ca_city@2 ASC, bought_city@3 ASC, ss_ticket_number@4 ASC], fetch=100 │ [Stage 8] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -4981,8 +4600,7 @@ mod tests { │ CoalescePartitionsExec │ [Stage 7] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=6, input_tasks=2 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_city], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_city], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=3, consumer_tasks=2, output_partitions=6 @@ -4991,8 +4609,7 @@ mod tests { │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=6, input_tasks=2 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_current_addr_sk@1 as c_current_addr_sk, c_first_name@2 as c_first_name, c_last_name@3 as c_last_name, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_addr_sk, c_first_name, c_last_name], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_addr_sk, c_first_name, c_last_name], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=3, consumer_tasks=2, output_partitions=6 @@ -5005,14 +4622,13 @@ mod tests { │ AggregateExec: mode=Partial, gby=[ss_ticket_number@2 as ss_ticket_number, ss_customer_sk@0 as ss_customer_sk, ss_addr_sk@1 as ss_addr_sk, ca_city@5 as ca_city], aggr=[sum(store_sales.ss_coupon_amt), sum(store_sales.ss_net_profit)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ss_addr_sk@1, CAST(customer_address.ca_address_sk AS Float64)@2)], projection=[ss_customer_sk@0, ss_addr_sk@1, ss_ticket_number@2, ss_coupon_amt@3, ss_net_profit@4, ca_city@6] │ CoalescePartitionsExec - │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[ca_address_sk@0 as ca_address_sk, ca_city@1 as ca_city, CAST(ca_address_sk@0 AS Float64) as CAST(customer_address.ca_address_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_city], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_city], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 4 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 4 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(household_demographics.hd_demo_sk AS Float64)@1, ss_hdemo_sk@1)], projection=[ss_customer_sk@2, ss_addr_sk@4, ss_ticket_number@5, ss_coupon_amt@6, ss_net_profit@7] │ CoalescePartitionsExec │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 @@ -5022,32 +4638,28 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ss_sold_date_sk@0)], projection=[ss_customer_sk@3, ss_hdemo_sk@4, ss_addr_sk@5, ss_store_sk@6, ss_ticket_number@7, ss_coupon_amt@8, ss_net_profit@9] │ CoalescePartitionsExec │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_customer_sk, ss_hdemo_sk, ss_addr_sk, ss_store_sk, ss_ticket_number, ss_coupon_amt, ss_net_profit], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_customer_sk, ss_hdemo_sk, ss_addr_sk, ss_store_sk, ss_ticket_number, ss_coupon_amt, ss_net_profit], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[hd_demo_sk@0 as hd_demo_sk, CAST(hd_demo_sk@0 AS Float64) as CAST(household_demographics.hd_demo_sk AS Float64)] │ FilterExec: hd_dep_count@1 = 4 OR hd_vehicle_count@2 = 3, projection=[hd_demo_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_dep_count, hd_vehicle_count], file_type=parquet, predicate=hd_dep_count@3 = 4 OR hd_vehicle_count@4 = 3, pruning_predicate=hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 4 AND 4 <= hd_dep_count_max@1 OR hd_vehicle_count_null_count@6 != row_count@3 AND hd_vehicle_count_min@4 <= 3 AND 3 <= hd_vehicle_count_max@5, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_dep_count, hd_vehicle_count], file_type=parquet, predicate=hd_dep_count@3 = 4 OR hd_vehicle_count@4 = 3, pruning_predicate=hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 4 AND 4 <= hd_dep_count_max@1 OR hd_vehicle_count_null_count@6 != row_count@3 AND hd_vehicle_count_min@4 <= 3 AND 3 <= hd_vehicle_count_max@5, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[s_store_sk@0 as s_store_sk, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)] │ FilterExec: s_city@1 = Fairview OR s_city@1 = Midway, projection=[s_store_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_city], file_type=parquet, predicate=s_city@22 = Fairview OR s_city@22 = Midway, pruning_predicate=s_city_null_count@2 != row_count@3 AND s_city_min@0 <= Fairview AND Fairview <= s_city_max@1 OR s_city_null_count@2 != row_count@3 AND s_city_min@0 <= Midway AND Midway <= s_city_max@1, required_guarantees=[s_city in (Fairview, Midway)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_city], file_type=parquet, predicate=s_city@22 = Fairview OR s_city@22 = Midway, pruning_predicate=s_city_null_count@2 != row_count@3 AND s_city_min@0 <= Fairview AND Fairview <= s_city_max@1 OR s_city_null_count@2 != row_count@3 AND s_city_min@0 <= Midway AND Midway <= s_city_max@1, required_guarantees=[s_city in (Fairview, Midway)] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: (d_dow@2 = 6 OR d_dow@2 = 0) AND (d_year@1 = 1999 OR d_year@1 = 2000 OR d_year@1 = 2001), projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_dow], file_type=parquet, predicate=(d_dow@7 = 6 OR d_dow@7 = 0) AND (d_year@6 = 1999 OR d_year@6 = 2000 OR d_year@6 = 2001), pruning_predicate=(d_dow_null_count@2 != row_count@3 AND d_dow_min@0 <= 6 AND 6 <= d_dow_max@1 OR d_dow_null_count@2 != row_count@3 AND d_dow_min@0 <= 0 AND 0 <= d_dow_max@1) AND (d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 1999 AND 1999 <= d_year_max@5 OR d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 2000 AND 2000 <= d_year_max@5 OR d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 2001 AND 2001 <= d_year_max@5), required_guarantees=[d_dow in (0, 6), d_year in (1999, 2000, 2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_dow], file_type=parquet, predicate=(d_dow@7 = 6 OR d_dow@7 = 0) AND (d_year@6 = 1999 OR d_year@6 = 2000 OR d_year@6 = 2001), pruning_predicate=(d_dow_null_count@2 != row_count@3 AND d_dow_min@0 <= 6 AND 6 <= d_dow_max@1 OR d_dow_null_count@2 != row_count@3 AND d_dow_min@0 <= 0 AND 0 <= d_dow_max@1) AND (d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 1999 AND 1999 <= d_year_max@5 OR d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 2000 AND 2000 <= d_year_max@5 OR d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 2001 AND 2001 <= d_year_max@5), required_guarantees=[d_dow in (0, 6), d_year in (1999, 2000, 2001)] └────────────────────────────────────────────────── "); Ok(()) @@ -5090,7 +4702,7 @@ mod tests { │ [Stage 4] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([i_category@0, i_brand@1, s_store_name@2, s_company_name@3, d_year@4, d_moy@5], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([i_category@0, i_brand@1, s_store_name@2, s_company_name@3, d_year@4, d_moy@5], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[i_category@1 as i_category, i_brand@0 as i_brand, s_store_name@5 as s_store_name, s_company_name@6 as s_company_name, d_year@3 as d_year, d_moy@4 as d_moy], aggr=[sum(store_sales.ss_sales_price)] │ ProjectionExec: expr=[i_brand@2 as i_brand, i_category@3 as i_category, ss_sales_price@4 as ss_sales_price, d_year@5 as d_year, d_moy@6 as d_moy, s_store_name@0 as s_store_name, s_company_name@1 as s_company_name] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(store.s_store_sk AS Float64)@3, ss_store_sk@2)], projection=[s_store_name@1, s_company_name@2, i_brand@4, i_category@5, ss_sales_price@7, d_year@8, d_moy@9] @@ -5103,26 +4715,22 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, ss_item_sk@1)], projection=[i_brand@1, i_category@2, ss_sold_date_sk@3, ss_store_sk@5, ss_sales_price@6] │ CoalescePartitionsExec │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_store_sk, ss_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_store_sk, ss_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name, s_company_name, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name, s_company_name, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, d_moy@2 as d_moy, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 1999 OR d_year@1 = 1998 AND d_moy@2 = 12 OR d_year@1 = 2000 AND d_moy@2 = 1 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1999 OR d_year@6 = 1998 AND d_moy@8 = 12 OR d_year@6 = 2000 AND d_moy@8 = 1, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1999 AND 1999 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 12 AND 12 <= d_moy_max@5 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 1 AND 1 <= d_moy_max@5, required_guarantees=[d_year in (1998, 1999, 2000)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1999 OR d_year@6 = 1998 AND d_moy@8 = 12 OR d_year@6 = 2000 AND d_moy@8 = 1, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1999 AND 1999 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 12 AND 12 <= d_moy_max@5 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 1 AND 1 <= d_moy_max@5, required_guarantees=[d_year in (1998, 1999, 2000)] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand, i_category], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand, i_category], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 10 ── Tasks: t0:[p0..p2] t1:[p0..p2] │ RepartitionExec: partitioning=Hash([i_category@0, i_brand@1, s_store_name@2, s_company_name@3], 3), input_partitions=3 @@ -5130,7 +4738,7 @@ mod tests { │ [Stage 9] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([i_category@0, i_brand@1, s_store_name@2, s_company_name@3, d_year@4, d_moy@5], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([i_category@0, i_brand@1, s_store_name@2, s_company_name@3, d_year@4, d_moy@5], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[i_category@1 as i_category, i_brand@0 as i_brand, s_store_name@5 as s_store_name, s_company_name@6 as s_company_name, d_year@3 as d_year, d_moy@4 as d_moy], aggr=[sum(store_sales.ss_sales_price)] │ ProjectionExec: expr=[i_brand@2 as i_brand, i_category@3 as i_category, ss_sales_price@4 as ss_sales_price, d_year@5 as d_year, d_moy@6 as d_moy, s_store_name@0 as s_store_name, s_company_name@1 as s_company_name] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(store.s_store_sk AS Float64)@3, ss_store_sk@2)], projection=[s_store_name@1, s_company_name@2, i_brand@4, i_category@5, ss_sales_price@7, d_year@8, d_moy@9] @@ -5143,26 +4751,22 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, ss_item_sk@1)], projection=[i_brand@1, i_category@2, ss_sold_date_sk@3, ss_store_sk@5, ss_sales_price@6] │ CoalescePartitionsExec │ [Stage 8] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_store_sk, ss_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_store_sk, ss_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name, s_company_name, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name, s_company_name, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, d_moy@2 as d_moy, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 1999 OR d_year@1 = 1998 AND d_moy@2 = 12 OR d_year@1 = 2000 AND d_moy@2 = 1 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1999 OR d_year@6 = 1998 AND d_moy@8 = 12 OR d_year@6 = 2000 AND d_moy@8 = 1, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1999 AND 1999 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 12 AND 12 <= d_moy_max@5 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 1 AND 1 <= d_moy_max@5, required_guarantees=[d_year in (1998, 1999, 2000)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1999 OR d_year@6 = 1998 AND d_moy@8 = 12 OR d_year@6 = 2000 AND d_moy@8 = 1, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1999 AND 1999 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 12 AND 12 <= d_moy_max@5 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 1 AND 1 <= d_moy_max@5, required_guarantees=[d_year in (1998, 1999, 2000)] └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand, i_category], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand, i_category], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 15 ── Tasks: t0:[p0..p2] t1:[p0..p2] │ RepartitionExec: partitioning=Hash([i_category@0, i_brand@1, s_store_name@2, s_company_name@3], 3), input_partitions=3 @@ -5170,7 +4774,7 @@ mod tests { │ [Stage 14] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 14 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([i_category@0, i_brand@1, s_store_name@2, s_company_name@3, d_year@4, d_moy@5], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([i_category@0, i_brand@1, s_store_name@2, s_company_name@3, d_year@4, d_moy@5], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[i_category@1 as i_category, i_brand@0 as i_brand, s_store_name@5 as s_store_name, s_company_name@6 as s_company_name, d_year@3 as d_year, d_moy@4 as d_moy], aggr=[sum(store_sales.ss_sales_price)] │ ProjectionExec: expr=[i_brand@2 as i_brand, i_category@3 as i_category, ss_sales_price@4 as ss_sales_price, d_year@5 as d_year, d_moy@6 as d_moy, s_store_name@0 as s_store_name, s_company_name@1 as s_company_name] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(store.s_store_sk AS Float64)@3, ss_store_sk@2)], projection=[s_store_name@1, s_company_name@2, i_brand@4, i_category@5, ss_sales_price@7, d_year@8, d_moy@9] @@ -5183,26 +4787,22 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, ss_item_sk@1)], projection=[i_brand@1, i_category@2, ss_sold_date_sk@3, ss_store_sk@5, ss_sales_price@6] │ CoalescePartitionsExec │ [Stage 13] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_store_sk, ss_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_store_sk, ss_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 11 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name, s_company_name, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name, s_company_name, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 12 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, d_moy@2 as d_moy, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 1999 OR d_year@1 = 1998 AND d_moy@2 = 12 OR d_year@1 = 2000 AND d_moy@2 = 1 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1999 OR d_year@6 = 1998 AND d_moy@8 = 12 OR d_year@6 = 2000 AND d_moy@8 = 1, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1999 AND 1999 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 12 AND 12 <= d_moy_max@5 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 1 AND 1 <= d_moy_max@5, required_guarantees=[d_year in (1998, 1999, 2000)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1999 OR d_year@6 = 1998 AND d_moy@8 = 12 OR d_year@6 = 2000 AND d_moy@8 = 1, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1999 AND 1999 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 12 AND 12 <= d_moy_max@5 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 1 AND 1 <= d_moy_max@5, required_guarantees=[d_year in (1998, 1999, 2000)] └────────────────────────────────────────────────── ┌───── Stage 13 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand, i_category], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand, i_category], file_type=parquet └────────────────────────────────────────────────── "#); Ok(()) @@ -5210,7 +4810,7 @@ mod tests { #[tokio::test] async fn test_tpcds_48() -> Result<()> { let display = test_tpcds_query("q48").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ AggregateExec: mode=Final, gby=[], aggr=[sum(store_sales.ss_quantity)] │ CoalescePartitionsExec @@ -5224,43 +4824,38 @@ mod tests { │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2000, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2000 AND DynamicFilter [ empty ], pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1, required_guarantees=[d_year in (2000)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2000 AND DynamicFilter [ empty ], pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1, required_guarantees=[d_year in (2000)] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ss_addr_sk@1, CAST(customer_address.ca_address_sk AS Float64)@2)], filter=(ca_state@1 = CO OR ca_state@1 = OH OR ca_state@1 = TX) AND ss_net_profit@0 >= Some(0),6,2 AND ss_net_profit@0 <= Some(200000),6,2 OR (ca_state@1 = OR OR ca_state@1 = MN OR ca_state@1 = KY) AND ss_net_profit@0 >= Some(15000),6,2 AND ss_net_profit@0 <= Some(300000),6,2 OR (ca_state@1 = VA OR ca_state@1 = CA OR ca_state@1 = MS) AND ss_net_profit@0 >= Some(5000),6,2 AND CAST(ss_net_profit@0 AS Decimal128(22, 2)) <= Some(2500000),22,2, projection=[ss_sold_date_sk@0, ss_quantity@2] │ CoalescePartitionsExec - │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[ca_address_sk@0 as ca_address_sk, ca_state@1 as ca_state, CAST(ca_address_sk@0 AS Float64) as CAST(customer_address.ca_address_sk AS Float64)] │ FilterExec: (ca_state@1 = CO OR ca_state@1 = OH OR ca_state@1 = TX OR ca_state@1 = OR OR ca_state@1 = MN OR ca_state@1 = KY OR ca_state@1 = VA OR ca_state@1 = CA OR ca_state@1 = MS) AND ca_state@1 IN (SET) ([CO, OH, TX, OR, MN, KY, VA, CA, MS]) AND ca_country@2 = United States, projection=[ca_address_sk@0, ca_state@1] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_state, ca_country], file_type=parquet, predicate=(ca_state@8 = CO OR ca_state@8 = OH OR ca_state@8 = TX OR ca_state@8 = OR OR ca_state@8 = MN OR ca_state@8 = KY OR ca_state@8 = VA OR ca_state@8 = CA OR ca_state@8 = MS) AND ca_state@8 IN (SET) ([CO, OH, TX, OR, MN, KY, VA, CA, MS]) AND ca_country@10 = United States AND DynamicFilter [ empty ], pruning_predicate=(ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= CO AND CO <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= OH AND OH <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= TX AND TX <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= OR AND OR <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= MN AND MN <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= KY AND KY <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= VA AND VA <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= CA AND CA <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= MS AND MS <= ca_state_max@1) AND (ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= CO AND CO <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= OH AND OH <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= TX AND TX <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= OR AND OR <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= MN AND MN <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= KY AND KY <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= VA AND VA <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= CA AND CA <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= MS AND MS <= ca_state_max@1) AND ca_country_null_count@6 != row_count@3 AND ca_country_min@4 <= United States AND United States <= ca_country_max@5, required_guarantees=[ca_country in (United States), ca_state in (CA, CO, KY, MN, MS, OH, OR, TX, VA)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_state, ca_country], file_type=parquet, predicate=(ca_state@8 = CO OR ca_state@8 = OH OR ca_state@8 = TX OR ca_state@8 = OR OR ca_state@8 = MN OR ca_state@8 = KY OR ca_state@8 = VA OR ca_state@8 = CA OR ca_state@8 = MS) AND ca_state@8 IN (SET) ([CO, OH, TX, OR, MN, KY, VA, CA, MS]) AND ca_country@10 = United States AND DynamicFilter [ empty ], pruning_predicate=(ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= CO AND CO <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= OH AND OH <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= TX AND TX <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= OR AND OR <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= MN AND MN <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= KY AND KY <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= VA AND VA <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= CA AND CA <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= MS AND MS <= ca_state_max@1) AND (ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= CO AND CO <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= OH AND OH <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= TX AND TX <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= OR AND OR <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= MN AND MN <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= KY AND KY <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= VA AND VA <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= CA AND CA <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= MS AND MS <= ca_state_max@1) AND ca_country_null_count@6 != row_count@3 AND ca_country_min@4 <= United States AND United States <= ca_country_max@5, required_guarantees=[ca_country in (United States), ca_state in (CA, CO, KY, MN, MS, OH, OR, TX, VA)] └────────────────────────────────────────────────── - ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ss_cdemo_sk@1, CAST(customer_demographics.cd_demo_sk AS Float64)@3)], filter=cd_marital_status@1 = M AND cd_education_status@2 = 4 yr Degree AND ss_sales_price@0 >= Some(10000),5,2 AND ss_sales_price@0 <= Some(15000),5,2 OR cd_marital_status@1 = D AND cd_education_status@2 = 2 yr Degree AND ss_sales_price@0 >= Some(5000),5,2 AND ss_sales_price@0 <= Some(10000),5,2 OR cd_marital_status@1 = S AND cd_education_status@2 = College AND ss_sales_price@0 >= Some(15000),5,2 AND ss_sales_price@0 <= Some(20000),5,2, projection=[ss_sold_date_sk@0, ss_addr_sk@2, ss_quantity@3, ss_net_profit@5] │ CoalescePartitionsExec - │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[cd_demo_sk@0 as cd_demo_sk, cd_marital_status@1 as cd_marital_status, cd_education_status@2 as cd_education_status, CAST(cd_demo_sk@0 AS Float64) as CAST(customer_demographics.cd_demo_sk AS Float64)] │ FilterExec: cd_marital_status@1 = M AND cd_education_status@2 = 4 yr Degree OR cd_marital_status@1 = D AND cd_education_status@2 = 2 yr Degree OR cd_marital_status@1 = S AND cd_education_status@2 = College - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], ...]}, projection=[cd_demo_sk, cd_marital_status, cd_education_status], file_type=parquet, predicate=(cd_marital_status@2 = M AND cd_education_status@3 = 4 yr Degree OR cd_marital_status@2 = D AND cd_education_status@3 = 2 yr Degree OR cd_marital_status@2 = S AND cd_education_status@3 = College) AND DynamicFilter [ empty ], pruning_predicate=cd_marital_status_null_count@2 != row_count@3 AND cd_marital_status_min@0 <= M AND M <= cd_marital_status_max@1 AND cd_education_status_null_count@6 != row_count@3 AND cd_education_status_min@4 <= 4 yr Degree AND 4 yr Degree <= cd_education_status_max@5 OR cd_marital_status_null_count@2 != row_count@3 AND cd_marital_status_min@0 <= D AND D <= cd_marital_status_max@1 AND cd_education_status_null_count@6 != row_count@3 AND cd_education_status_min@4 <= 2 yr Degree AND 2 yr Degree <= cd_education_status_max@5 OR cd_marital_status_null_count@2 != row_count@3 AND cd_marital_status_min@0 <= S AND S <= cd_marital_status_max@1 AND cd_education_status_null_count@6 != row_count@3 AND cd_education_status_min@4 <= College AND College <= cd_education_status_max@5, required_guarantees=[cd_education_status in (2 yr Degree, 4 yr Degree, College), cd_marital_status in (D, M, S)] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-3.parquet:..]]}, projection=[cd_demo_sk, cd_marital_status, cd_education_status], file_type=parquet, predicate=(cd_marital_status@2 = M AND cd_education_status@3 = 4 yr Degree OR cd_marital_status@2 = D AND cd_education_status@3 = 2 yr Degree OR cd_marital_status@2 = S AND cd_education_status@3 = College) AND DynamicFilter [ empty ], pruning_predicate=cd_marital_status_null_count@2 != row_count@3 AND cd_marital_status_min@0 <= M AND M <= cd_marital_status_max@1 AND cd_education_status_null_count@6 != row_count@3 AND cd_education_status_min@4 <= 4 yr Degree AND 4 yr Degree <= cd_education_status_max@5 OR cd_marital_status_null_count@2 != row_count@3 AND cd_marital_status_min@0 <= D AND D <= cd_marital_status_max@1 AND cd_education_status_null_count@6 != row_count@3 AND cd_education_status_min@4 <= 2 yr Degree AND 2 yr Degree <= cd_education_status_max@5 OR cd_marital_status_null_count@2 != row_count@3 AND cd_marital_status_min@0 <= S AND S <= cd_marital_status_max@1 AND cd_education_status_null_count@6 != row_count@3 AND cd_education_status_min@4 <= College AND College <= cd_education_status_max@5, required_guarantees=[cd_education_status in (2 yr Degree, 4 yr Degree, College), cd_marital_status in (D, M, S)] └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(store.s_store_sk AS Float64)@1, ss_store_sk@3)], projection=[ss_sold_date_sk@2, ss_cdemo_sk@3, ss_addr_sk@4, ss_quantity@6, ss_sales_price@7, ss_net_profit@8] │ CoalescePartitionsExec │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 │ FilterExec: (ss_net_profit@6 >= Some(0),6,2 AND ss_net_profit@6 <= Some(200000),6,2 OR ss_net_profit@6 >= Some(15000),6,2 AND ss_net_profit@6 <= Some(300000),6,2 OR ss_net_profit@6 >= Some(5000),6,2 AND CAST(ss_net_profit@6 AS Decimal128(22, 2)) <= Some(2500000),22,2) AND (ss_sales_price@5 >= Some(10000),5,2 AND ss_sales_price@5 <= Some(15000),5,2 OR ss_sales_price@5 >= Some(5000),5,2 AND ss_sales_price@5 <= Some(10000),5,2 OR ss_sales_price@5 >= Some(15000),5,2 AND ss_sales_price@5 <= Some(20000),5,2) - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_cdemo_sk, ss_addr_sk, ss_store_sk, ss_quantity, ss_sales_price, ss_net_profit], file_type=parquet, predicate=(ss_net_profit@22 >= Some(0),6,2 AND ss_net_profit@22 <= Some(200000),6,2 OR ss_net_profit@22 >= Some(15000),6,2 AND ss_net_profit@22 <= Some(300000),6,2 OR ss_net_profit@22 >= Some(5000),6,2 AND CAST(ss_net_profit@22 AS Decimal128(22, 2)) <= Some(2500000),22,2) AND (ss_sales_price@13 >= Some(10000),5,2 AND ss_sales_price@13 <= Some(15000),5,2 OR ss_sales_price@13 >= Some(5000),5,2 AND ss_sales_price@13 <= Some(10000),5,2 OR ss_sales_price@13 >= Some(15000),5,2 AND ss_sales_price@13 <= Some(20000),5,2) AND DynamicFilter [ empty ], pruning_predicate=(ss_net_profit_null_count@1 != row_count@2 AND ss_net_profit_max@0 >= Some(0),6,2 AND ss_net_profit_null_count@1 != row_count@2 AND ss_net_profit_min@3 <= Some(200000),6,2 OR ss_net_profit_null_count@1 != row_count@2 AND ss_net_profit_max@0 >= Some(15000),6,2 AND ss_net_profit_null_count@1 != row_count@2 AND ss_net_profit_min@3 <= Some(300000),6,2 OR ss_net_profit_null_count@1 != row_count@2 AND ss_net_profit_max@0 >= Some(5000),6,2 AND ss_net_profit_null_count@1 != row_count@2 AND CAST(ss_net_profit_min@3 AS Decimal128(22, 2)) <= Some(2500000),22,2) AND (ss_sales_price_null_count@5 != row_count@2 AND ss_sales_price_max@4 >= Some(10000),5,2 AND ss_sales_price_null_count@5 != row_count@2 AND ss_sales_price_min@6 <= Some(15000),5,2 OR ss_sales_price_null_count@5 != row_count@2 AND ss_sales_price_max@4 >= Some(5000),5,2 AND ss_sales_price_null_count@5 != row_count@2 AND ss_sales_price_min@6 <= Some(10000),5,2 OR ss_sales_price_null_count@5 != row_count@2 AND ss_sales_price_max@4 >= Some(15000),5,2 AND ss_sales_price_null_count@5 != row_count@2 AND ss_sales_price_min@6 <= Some(20000),5,2), required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_cdemo_sk, ss_addr_sk, ss_store_sk, ss_quantity, ss_sales_price, ss_net_profit], file_type=parquet, predicate=(ss_net_profit@22 >= Some(0),6,2 AND ss_net_profit@22 <= Some(200000),6,2 OR ss_net_profit@22 >= Some(15000),6,2 AND ss_net_profit@22 <= Some(300000),6,2 OR ss_net_profit@22 >= Some(5000),6,2 AND CAST(ss_net_profit@22 AS Decimal128(22, 2)) <= Some(2500000),22,2) AND (ss_sales_price@13 >= Some(10000),5,2 AND ss_sales_price@13 <= Some(15000),5,2 OR ss_sales_price@13 >= Some(5000),5,2 AND ss_sales_price@13 <= Some(10000),5,2 OR ss_sales_price@13 >= Some(15000),5,2 AND ss_sales_price@13 <= Some(20000),5,2) AND DynamicFilter [ empty ], pruning_predicate=(ss_net_profit_null_count@1 != row_count@2 AND ss_net_profit_max@0 >= Some(0),6,2 AND ss_net_profit_null_count@1 != row_count@2 AND ss_net_profit_min@3 <= Some(200000),6,2 OR ss_net_profit_null_count@1 != row_count@2 AND ss_net_profit_max@0 >= Some(15000),6,2 AND ss_net_profit_null_count@1 != row_count@2 AND ss_net_profit_min@3 <= Some(300000),6,2 OR ss_net_profit_null_count@1 != row_count@2 AND ss_net_profit_max@0 >= Some(5000),6,2 AND ss_net_profit_null_count@1 != row_count@2 AND CAST(ss_net_profit_min@3 AS Decimal128(22, 2)) <= Some(2500000),22,2) AND (ss_sales_price_null_count@5 != row_count@2 AND ss_sales_price_max@4 >= Some(10000),5,2 AND ss_sales_price_null_count@5 != row_count@2 AND ss_sales_price_min@6 <= Some(15000),5,2 OR ss_sales_price_null_count@5 != row_count@2 AND ss_sales_price_max@4 >= Some(5000),5,2 AND ss_sales_price_null_count@5 != row_count@2 AND ss_sales_price_min@6 <= Some(10000),5,2 OR ss_sales_price_null_count@5 != row_count@2 AND ss_sales_price_max@4 >= Some(15000),5,2 AND ss_sales_price_null_count@5 != row_count@2 AND ss_sales_price_min@6 <= Some(20000),5,2), required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── "); Ok(()) @@ -5328,19 +4923,16 @@ mod tests { │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2001 AND d_moy@2 = 12, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2001 AND d_moy@8 = 12, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 12 AND 12 <= d_moy_max@5, required_guarantees=[d_moy in (12), d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2001 AND d_moy@8 = 12, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 12 AND 12 <= d_moy_max@5, required_guarantees=[d_moy in (12), d_year in (2001)] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p0..p5] │ RepartitionExec: partitioning=Hash([wr_order_number@1, wr_item_sk@0], 6), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_returns/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_returns/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_returns/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_returns/part-3.parquet]]}, projection=[wr_item_sk, wr_order_number, wr_return_quantity, wr_return_amt], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_returns/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/web_returns/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_returns/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/web_returns/part-3.parquet]]}, projection=[wr_item_sk, wr_order_number, wr_return_quantity, wr_return_amt], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([ws_order_number@2, ws_item_sk@1], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([ws_order_number@2, ws_item_sk@1], 6), input_partitions=3 │ FilterExec: ws_net_profit@5 > Some(100),7,2 AND ws_net_paid@4 > Some(0),7,2 AND ws_quantity@3 > 0, projection=[ws_sold_date_sk@0, ws_item_sk@1, ws_order_number@2, ws_quantity@3, ws_net_paid@4] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_item_sk, ws_order_number, ws_quantity, ws_net_paid, ws_net_profit], file_type=parquet, predicate=ws_net_profit@33 > Some(100),7,2 AND ws_net_paid@29 > Some(0),7,2 AND ws_quantity@18 > 0 AND DynamicFilter [ empty ], pruning_predicate=ws_net_profit_null_count@1 != row_count@2 AND ws_net_profit_max@0 > Some(100),7,2 AND ws_net_paid_null_count@4 != row_count@2 AND ws_net_paid_max@3 > Some(0),7,2 AND ws_quantity_null_count@6 != row_count@2 AND ws_quantity_max@5 > 0, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_item_sk, ws_order_number, ws_quantity, ws_net_paid, ws_net_profit], file_type=parquet, predicate=ws_net_profit@33 > Some(100),7,2 AND ws_net_paid@29 > Some(0),7,2 AND ws_quantity@18 > 0 AND DynamicFilter [ empty ], pruning_predicate=ws_net_profit_null_count@1 != row_count@2 AND ws_net_profit_max@0 > Some(100),7,2 AND ws_net_paid_null_count@4 != row_count@2 AND ws_net_paid_max@3 > Some(0),7,2 AND ws_quantity_null_count@6 != row_count@2 AND ws_quantity_max@5 > 0, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p2] t1:[p0..p2] │ SortExec: expr=[return_ratio@1 ASC NULLS LAST], preserve_partitioning=[true] @@ -5366,19 +4958,16 @@ mod tests { │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2001 AND d_moy@2 = 12, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2001 AND d_moy@8 = 12, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 12 AND 12 <= d_moy_max@5, required_guarantees=[d_moy in (12), d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2001 AND d_moy@8 = 12, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 12 AND 12 <= d_moy_max@5, required_guarantees=[d_moy in (12), d_year in (2001)] └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([cr_order_number@1, cr_item_sk@0], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], ...]}, projection=[cr_item_sk, cr_order_number, cr_return_quantity, cr_return_amount], file_type=parquet + │ RepartitionExec: partitioning=Hash([cr_order_number@1, cr_item_sk@0], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-3.parquet:..]]}, projection=[cr_item_sk, cr_order_number, cr_return_quantity, cr_return_amount], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([cs_order_number@2, cs_item_sk@1], 9), input_partitions=2 + │ RepartitionExec: partitioning=Hash([cs_order_number@2, cs_item_sk@1], 9), input_partitions=3 │ FilterExec: cs_net_profit@5 > Some(100),7,2 AND cs_net_paid@4 > Some(0),7,2 AND cs_quantity@3 > 0, projection=[cs_sold_date_sk@0, cs_item_sk@1, cs_order_number@2, cs_quantity@3, cs_net_paid@4] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_item_sk, cs_order_number, cs_quantity, cs_net_paid, cs_net_profit], file_type=parquet, predicate=cs_net_profit@33 > Some(100),7,2 AND cs_net_paid@29 > Some(0),7,2 AND cs_quantity@18 > 0 AND DynamicFilter [ empty ], pruning_predicate=cs_net_profit_null_count@1 != row_count@2 AND cs_net_profit_max@0 > Some(100),7,2 AND cs_net_paid_null_count@4 != row_count@2 AND cs_net_paid_max@3 > Some(0),7,2 AND cs_quantity_null_count@6 != row_count@2 AND cs_quantity_max@5 > 0, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_item_sk, cs_order_number, cs_quantity, cs_net_paid, cs_net_profit], file_type=parquet, predicate=cs_net_profit@33 > Some(100),7,2 AND cs_net_paid@29 > Some(0),7,2 AND cs_quantity@18 > 0 AND DynamicFilter [ empty ], pruning_predicate=cs_net_profit_null_count@1 != row_count@2 AND cs_net_profit_max@0 > Some(100),7,2 AND cs_net_paid_null_count@4 != row_count@2 AND cs_net_paid_max@3 > Some(0),7,2 AND cs_quantity_null_count@6 != row_count@2 AND cs_quantity_max@5 > 0, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 14 ── Tasks: t0:[p0..p2] t1:[p0..p2] │ SortExec: expr=[return_ratio@1 ASC NULLS LAST], preserve_partitioning=[true] @@ -5404,19 +4993,16 @@ mod tests { │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2001 AND d_moy@2 = 12, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2001 AND d_moy@8 = 12, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 12 AND 12 <= d_moy_max@5, required_guarantees=[d_moy in (12), d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2001 AND d_moy@8 = 12, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 12 AND 12 <= d_moy_max@5, required_guarantees=[d_moy in (12), d_year in (2001)] └────────────────────────────────────────────────── ┌───── Stage 11 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([sr_ticket_number@1, sr_item_sk@0], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], ...]}, projection=[sr_item_sk, sr_ticket_number, sr_return_quantity, sr_return_amt], file_type=parquet + │ RepartitionExec: partitioning=Hash([sr_ticket_number@1, sr_item_sk@0], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-3.parquet:..]]}, projection=[sr_item_sk, sr_ticket_number, sr_return_quantity, sr_return_amt], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 12 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([ss_ticket_number@2, ss_item_sk@1], 9), input_partitions=2 + │ RepartitionExec: partitioning=Hash([ss_ticket_number@2, ss_item_sk@1], 9), input_partitions=3 │ FilterExec: ss_net_profit@5 > Some(100),6,2 AND ss_net_paid@4 > Some(0),7,2 AND ss_quantity@3 > 0, projection=[ss_sold_date_sk@0, ss_item_sk@1, ss_ticket_number@2, ss_quantity@3, ss_net_paid@4] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_ticket_number, ss_quantity, ss_net_paid, ss_net_profit], file_type=parquet, predicate=ss_net_profit@22 > Some(100),6,2 AND ss_net_paid@20 > Some(0),7,2 AND ss_quantity@10 > 0 AND DynamicFilter [ empty ], pruning_predicate=ss_net_profit_null_count@1 != row_count@2 AND ss_net_profit_max@0 > Some(100),6,2 AND ss_net_paid_null_count@4 != row_count@2 AND ss_net_paid_max@3 > Some(0),7,2 AND ss_quantity_null_count@6 != row_count@2 AND ss_quantity_max@5 > 0, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_ticket_number, ss_quantity, ss_net_paid, ss_net_profit], file_type=parquet, predicate=ss_net_profit@22 > Some(100),6,2 AND ss_net_paid@20 > Some(0),7,2 AND ss_quantity@10 > 0 AND DynamicFilter [ empty ], pruning_predicate=ss_net_profit_null_count@1 != row_count@2 AND ss_net_profit_max@0 > Some(100),6,2 AND ss_net_paid_null_count@4 != row_count@2 AND ss_net_paid_max@3 > Some(0),7,2 AND ss_quantity_null_count@6 != row_count@2 AND ss_quantity_max@5 > 0, required_guarantees=[] └────────────────────────────────────────────────── "#); Ok(()) @@ -5424,7 +5010,7 @@ mod tests { #[tokio::test] async fn test_tpcds_50() -> Result<()> { let display = test_tpcds_query("q50").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [s_store_name@0 ASC NULLS LAST, s_company_id@1 ASC NULLS LAST, s_street_number@2 ASC NULLS LAST, s_street_name@3 ASC NULLS LAST, s_street_type@4 ASC NULLS LAST, s_suite_number@5 ASC NULLS LAST, s_city@6 ASC NULLS LAST, s_county@7 ASC NULLS LAST, s_state@8 ASC NULLS LAST, s_zip@9 ASC NULLS LAST], fetch=100 │ [Stage 7] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -5445,8 +5031,7 @@ mod tests { │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(d2.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2001 AND d_moy@2 = 8, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2001 AND d_moy@8 = 8 AND DynamicFilter [ empty ], pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 8 AND 8 <= d_moy_max@5, required_guarantees=[d_moy in (8), d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2001 AND d_moy@8 = 8 AND DynamicFilter [ empty ], pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 8 AND 8 <= d_moy_max@5, required_guarantees=[d_moy in (8), d_year in (2001)] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -5455,8 +5040,7 @@ mod tests { │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(d1.d_date_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -5471,18 +5055,15 @@ mod tests { └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name, s_company_id, s_street_number, s_street_name, s_street_type, s_suite_number, s_city, s_county, s_state, s_zip, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name, s_company_id, s_street_number, s_street_name, s_street_type, s_suite_number, s_city, s_county, s_state, s_zip, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([sr_ticket_number@3, sr_item_sk@1, sr_customer_sk@2], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], ...]}, projection=[sr_returned_date_sk, sr_item_sk, sr_customer_sk, sr_ticket_number], file_type=parquet + │ RepartitionExec: partitioning=Hash([sr_ticket_number@3, sr_item_sk@1, sr_customer_sk@2], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-3.parquet:..]]}, projection=[sr_returned_date_sk, sr_item_sk, sr_customer_sk, sr_ticket_number], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([ss_ticket_number@4, ss_item_sk@1, ss_customer_sk@2], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_customer_sk, ss_store_sk, ss_ticket_number], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([ss_ticket_number@4, ss_item_sk@1, ss_customer_sk@2], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_customer_sk, ss_store_sk, ss_ticket_number], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── "); Ok(()) @@ -5517,23 +5098,21 @@ mod tests { │ [Stage 2] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([ws_item_sk@0, d_date@1], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([ws_item_sk@0, d_date@1], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[ws_item_sk@0 as ws_item_sk, d_date@2 as d_date], aggr=[sum(web_sales.ws_sales_price)] │ ProjectionExec: expr=[ws_item_sk@1 as ws_item_sk, ws_sales_price@2 as ws_sales_price, d_date@0 as d_date] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@2, ws_sold_date_sk@0)], projection=[d_date@1, ws_item_sk@4, ws_sales_price@5] │ CoalescePartitionsExec │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 │ FilterExec: ws_item_sk@1 IS NOT NULL - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_item_sk, ws_sales_price], file_type=parquet, predicate=ws_item_sk@3 IS NOT NULL AND DynamicFilter [ empty ], pruning_predicate=ws_item_sk_null_count@1 != row_count@0, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_item_sk, ws_sales_price], file_type=parquet, predicate=ws_item_sk@3 IS NOT NULL AND DynamicFilter [ empty ], pruning_predicate=ws_item_sk_null_count@1 != row_count@0, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_date@1 as d_date, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_month_seq@2 >= 1200 AND d_month_seq@2 <= 1211, projection=[d_date_sk@0, d_date@1] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date, d_month_seq], file_type=parquet, predicate=d_month_seq@3 >= 1200 AND d_month_seq@3 <= 1211, pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1200 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1211, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date, d_month_seq], file_type=parquet, predicate=d_month_seq@3 >= 1200 AND d_month_seq@3 <= 1211, pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1200 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1211, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p2] t1:[p0..p2] │ RepartitionExec: partitioning=Hash([ss_item_sk@0], 3), input_partitions=3 @@ -5541,23 +5120,21 @@ mod tests { │ [Stage 5] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([ss_item_sk@0, d_date@1], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([ss_item_sk@0, d_date@1], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[ss_item_sk@0 as ss_item_sk, d_date@2 as d_date], aggr=[sum(store_sales.ss_sales_price)] │ ProjectionExec: expr=[ss_item_sk@1 as ss_item_sk, ss_sales_price@2 as ss_sales_price, d_date@0 as d_date] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@2, ss_sold_date_sk@0)], projection=[d_date@1, ss_item_sk@4, ss_sales_price@5] │ CoalescePartitionsExec │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 │ FilterExec: ss_item_sk@1 IS NOT NULL - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_sales_price], file_type=parquet, predicate=ss_item_sk@2 IS NOT NULL AND DynamicFilter [ empty ], pruning_predicate=ss_item_sk_null_count@1 != row_count@0, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_sales_price], file_type=parquet, predicate=ss_item_sk@2 IS NOT NULL AND DynamicFilter [ empty ], pruning_predicate=ss_item_sk_null_count@1 != row_count@0, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_date@1 as d_date, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_month_seq@2 >= 1200 AND d_month_seq@2 <= 1211, projection=[d_date_sk@0, d_date@1] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date, d_month_seq], file_type=parquet, predicate=d_month_seq@3 >= 1200 AND d_month_seq@3 <= 1211, pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1200 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1211, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date, d_month_seq], file_type=parquet, predicate=d_month_seq@3 >= 1200 AND d_month_seq@3 <= 1211, pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1200 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1211, required_guarantees=[] └────────────────────────────────────────────────── "#); Ok(()) @@ -5565,7 +5142,7 @@ mod tests { #[tokio::test] async fn test_tpcds_52() -> Result<()> { let display = test_tpcds_query("q52").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [d_year@0 ASC NULLS LAST, ext_price@3 DESC, brand_id@1 ASC NULLS LAST], fetch=100 │ [Stage 4] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -5581,27 +5158,24 @@ mod tests { │ AggregateExec: mode=Partial, gby=[d_year@0 as d_year, i_brand@3 as i_brand, i_brand_id@2 as i_brand_id], aggr=[sum(store_sales.ss_ext_sales_price)], ordering_mode=PartiallySorted([0]) │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ss_item_sk@1, i_item_sk@0)], projection=[d_year@0, ss_ext_sales_price@2, i_brand_id@4, i_brand@5] │ CoalescePartitionsExec - │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ FilterExec: i_manager_id@3 = 1, projection=[i_item_sk@0, i_brand_id@1, i_brand@2] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_brand, i_manager_id], file_type=parquet, predicate=i_manager_id@20 = 1 AND DynamicFilter [ empty ], pruning_predicate=i_manager_id_null_count@2 != row_count@3 AND i_manager_id_min@0 <= 1 AND 1 <= i_manager_id_max@1, required_guarantees=[i_manager_id in (1)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_brand, i_manager_id], file_type=parquet, predicate=i_manager_id@20 = 1 AND DynamicFilter [ empty ], pruning_predicate=i_manager_id_null_count@2 != row_count@3 AND i_manager_id_min@0 <= 1 AND 1 <= i_manager_id_max@1, required_guarantees=[i_manager_id in (1)] └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(dt.d_date_sk AS Float64)@2, ss_sold_date_sk@0)], projection=[d_year@1, ss_item_sk@4, ss_ext_sales_price@5] │ CoalescePartitionsExec │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, CAST(d_date_sk@0 AS Float64) as CAST(dt.d_date_sk AS Float64)] │ FilterExec: d_moy@2 = 11 AND d_year@1 = 2000, projection=[d_date_sk@0, d_year@1] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_moy@8 = 11 AND d_year@6 = 2000, pruning_predicate=d_moy_null_count@2 != row_count@3 AND d_moy_min@0 <= 11 AND 11 <= d_moy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 2000 AND 2000 <= d_year_max@5, required_guarantees=[d_moy in (11), d_year in (2000)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_moy@8 = 11 AND d_year@6 = 2000, pruning_predicate=d_moy_null_count@2 != row_count@3 AND d_moy_min@0 <= 11 AND 11 <= d_moy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 2000 AND 2000 <= d_year_max@5, required_guarantees=[d_moy in (11), d_year in (2000)] └────────────────────────────────────────────────── "); Ok(()) @@ -5626,7 +5200,7 @@ mod tests { │ [Stage 4] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([i_manufact_id@0, d_qoy@1], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([i_manufact_id@0, d_qoy@1], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[i_manufact_id@0 as i_manufact_id, d_qoy@2 as d_qoy], aggr=[sum(store_sales.ss_sales_price)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(store.s_store_sk AS Float64)@1, ss_store_sk@1)], projection=[i_manufact_id@2, ss_sales_price@4, d_qoy@5] │ CoalescePartitionsExec @@ -5638,28 +5212,24 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, ss_item_sk@1)], projection=[i_manufact_id@1, ss_sold_date_sk@2, ss_store_sk@4, ss_sales_price@5] │ CoalescePartitionsExec │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_store_sk, ss_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_store_sk, ss_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_qoy@1 as d_qoy, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_month_seq@1 IN (SET) ([1200, 1201, 1202, 1203, 1204, 1205, 1206, 1207, 1208, 1209, 1210, 1211]), projection=[d_date_sk@0, d_qoy@2] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_month_seq, d_qoy], file_type=parquet, predicate=d_month_seq@3 IN (SET) ([1200, 1201, 1202, 1203, 1204, 1205, 1206, 1207, 1208, 1209, 1210, 1211]), pruning_predicate=d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1200 AND 1200 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1201 AND 1201 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1202 AND 1202 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1203 AND 1203 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1204 AND 1204 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1205 AND 1205 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1206 AND 1206 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1207 AND 1207 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1208 AND 1208 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1209 AND 1209 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1210 AND 1210 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1211 AND 1211 <= d_month_seq_max@1, required_guarantees=[d_month_seq in (1200, 1201, 1202, 1203, 1204, 1205, 1206, 1207, 1208, 1209, 1210, 1211)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_month_seq, d_qoy], file_type=parquet, predicate=d_month_seq@3 IN (SET) ([1200, 1201, 1202, 1203, 1204, 1205, 1206, 1207, 1208, 1209, 1210, 1211]), pruning_predicate=d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1200 AND 1200 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1201 AND 1201 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1202 AND 1202 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1203 AND 1203 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1204 AND 1204 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1205 AND 1205 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1206 AND 1206 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1207 AND 1207 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1208 AND 1208 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1209 AND 1209 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1210 AND 1210 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1211 AND 1211 <= d_month_seq_max@1, required_guarantees=[d_month_seq in (1200, 1201, 1202, 1203, 1204, 1205, 1206, 1207, 1208, 1209, 1210, 1211)] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ FilterExec: (i_category@3 = Books OR i_category@3 = Children OR i_category@3 = Electronics) AND i_class@2 IN (SET) ([personal, portable, reference, self-help]) AND i_brand@1 IN (SET) ([scholaramalgamalg #14, scholaramalgamalg #7, exportiunivamalg #9, scholaramalgamalg #9]) OR (i_category@3 = Women OR i_category@3 = Music OR i_category@3 = Men) AND i_class@2 IN (SET) ([accessories, classical, fragrances, pants]) AND i_brand@1 IN (SET) ([amalgimporto #1, edu packscholar #1, exportiimporto #1, importoamalg #1]), projection=[i_item_sk@0, i_manufact_id@4] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand, i_class, i_category, i_manufact_id], file_type=parquet, predicate=(i_category@12 = Books OR i_category@12 = Children OR i_category@12 = Electronics) AND i_class@10 IN (SET) ([personal, portable, reference, self-help]) AND i_brand@8 IN (SET) ([scholaramalgamalg #14, scholaramalgamalg #7, exportiunivamalg #9, scholaramalgamalg #9]) OR (i_category@12 = Women OR i_category@12 = Music OR i_category@12 = Men) AND i_class@10 IN (SET) ([accessories, classical, fragrances, pants]) AND i_brand@8 IN (SET) ([amalgimporto #1, edu packscholar #1, exportiimporto #1, importoamalg #1]), pruning_predicate=(i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Books AND Books <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Children AND Children <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Electronics AND Electronics <= i_category_max@1) AND (i_class_null_count@6 != row_count@3 AND i_class_min@4 <= personal AND personal <= i_class_max@5 OR i_class_null_count@6 != row_count@3 AND i_class_min@4 <= portable AND portable <= i_class_max@5 OR i_class_null_count@6 != row_count@3 AND i_class_min@4 <= reference AND reference <= i_class_max@5 OR i_class_null_count@6 != row_count@3 AND i_class_min@4 <= self-help AND self-help <= i_class_max@5) AND (i_brand_null_count@9 != row_count@3 AND i_brand_min@7 <= scholaramalgamalg #14 AND scholaramalgamalg #14 <= i_brand_max@8 OR i_brand_null_count@9 != row_count@3 AND i_brand_min@7 <= scholaramalgamalg #7 AND scholaramalgamalg #7 <= i_brand_max@8 OR i_brand_null_count@9 != row_count@3 AND i_brand_min@7 <= exportiunivamalg #9 AND exportiunivamalg #9 <= i_brand_max@8 OR i_brand_null_count@9 != row_count@3 AND i_brand_min@7 <= scholaramalgamalg #9 AND scholaramalgamalg #9 <= i_brand_max@8) OR (i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Women AND Women <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Music AND Music <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Men AND Men <= i_category_max@1) AND (i_class_null_count@6 != row_count@3 AND i_class_min@4 <= accessories AND accessories <= i_class_max@5 OR i_class_null_count@6 != row_count@3 AND i_class_min@4 <= classical AND classical <= i_class_max@5 OR i_class_null_count@6 != row_count@3 AND i_class_min@4 <= fragrances AND fragrances <= i_class_max@5 OR i_class_null_count@6 != row_count@3 AND i_class_min@4 <= pants AND pants <= i_class_max@5) AND (i_brand_null_count@9 != row_count@3 AND i_brand_min@7 <= amalgimporto #1 AND amalgimporto #1 <= i_brand_max@8 OR i_brand_null_count@9 != row_count@3 AND i_brand_min@7 <= edu packscholar #1 AND edu packscholar #1 <= i_brand_max@8 OR i_brand_null_count@9 != row_count@3 AND i_brand_min@7 <= exportiimporto #1 AND exportiimporto #1 <= i_brand_max@8 OR i_brand_null_count@9 != row_count@3 AND i_brand_min@7 <= importoamalg #1 AND importoamalg #1 <= i_brand_max@8), required_guarantees=[i_brand in (amalgimporto #1, edu packscholar #1, exportiimporto #1, exportiunivamalg #9, importoamalg #1, scholaramalgamalg #14, scholaramalgamalg #7, scholaramalgamalg #9), i_class in (accessories, classical, fragrances, pants, personal, portable, reference, self-help)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand, i_class, i_category, i_manufact_id], file_type=parquet, predicate=(i_category@12 = Books OR i_category@12 = Children OR i_category@12 = Electronics) AND i_class@10 IN (SET) ([personal, portable, reference, self-help]) AND i_brand@8 IN (SET) ([scholaramalgamalg #14, scholaramalgamalg #7, exportiunivamalg #9, scholaramalgamalg #9]) OR (i_category@12 = Women OR i_category@12 = Music OR i_category@12 = Men) AND i_class@10 IN (SET) ([accessories, classical, fragrances, pants]) AND i_brand@8 IN (SET) ([amalgimporto #1, edu packscholar #1, exportiimporto #1, importoamalg #1]), pruning_predicate=(i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Books AND Books <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Children AND Children <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Electronics AND Electronics <= i_category_max@1) AND (i_class_null_count@6 != row_count@3 AND i_class_min@4 <= personal AND personal <= i_class_max@5 OR i_class_null_count@6 != row_count@3 AND i_class_min@4 <= portable AND portable <= i_class_max@5 OR i_class_null_count@6 != row_count@3 AND i_class_min@4 <= reference AND reference <= i_class_max@5 OR i_class_null_count@6 != row_count@3 AND i_class_min@4 <= self-help AND self-help <= i_class_max@5) AND (i_brand_null_count@9 != row_count@3 AND i_brand_min@7 <= scholaramalgamalg #14 AND scholaramalgamalg #14 <= i_brand_max@8 OR i_brand_null_count@9 != row_count@3 AND i_brand_min@7 <= scholaramalgamalg #7 AND scholaramalgamalg #7 <= i_brand_max@8 OR i_brand_null_count@9 != row_count@3 AND i_brand_min@7 <= exportiunivamalg #9 AND exportiunivamalg #9 <= i_brand_max@8 OR i_brand_null_count@9 != row_count@3 AND i_brand_min@7 <= scholaramalgamalg #9 AND scholaramalgamalg #9 <= i_brand_max@8) OR (i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Women AND Women <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Music AND Music <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Men AND Men <= i_category_max@1) AND (i_class_null_count@6 != row_count@3 AND i_class_min@4 <= accessories AND accessories <= i_class_max@5 OR i_class_null_count@6 != row_count@3 AND i_class_min@4 <= classical AND classical <= i_class_max@5 OR i_class_null_count@6 != row_count@3 AND i_class_min@4 <= fragrances AND fragrances <= i_class_max@5 OR i_class_null_count@6 != row_count@3 AND i_class_min@4 <= pants AND pants <= i_class_max@5) AND (i_brand_null_count@9 != row_count@3 AND i_brand_min@7 <= amalgimporto #1 AND amalgimporto #1 <= i_brand_max@8 OR i_brand_null_count@9 != row_count@3 AND i_brand_min@7 <= edu packscholar #1 AND edu packscholar #1 <= i_brand_max@8 OR i_brand_null_count@9 != row_count@3 AND i_brand_min@7 <= exportiimporto #1 AND exportiimporto #1 <= i_brand_max@8 OR i_brand_null_count@9 != row_count@3 AND i_brand_min@7 <= importoamalg #1 AND importoamalg #1 <= i_brand_max@8), required_guarantees=[i_brand in (amalgimporto #1, edu packscholar #1, exportiimporto #1, exportiunivamalg #9, importoamalg #1, scholaramalgamalg #14, scholaramalgamalg #7, scholaramalgamalg #9), i_class in (accessories, classical, fragrances, pants, personal, portable, reference, self-help)] └────────────────────────────────────────────────── "#); Ok(()) @@ -5704,8 +5274,7 @@ mod tests { │ ProjectionExec: expr=[CAST(d_month_seq@0 AS Int64) + 1 as date_dim.d_month_seq + Int64(1)] │ FilterExec: d_year@1 = 1998 AND d_moy@2 = 12, projection=[d_month_seq@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_month_seq, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1998 AND d_moy@8 = 12, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 12 AND 12 <= d_moy_max@5, required_guarantees=[d_moy in (12), d_year in (1998)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_month_seq, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1998 AND d_moy@8 = 12, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 12 AND 12 <= d_moy_max@5, required_guarantees=[d_moy in (12), d_year in (1998)] └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ BroadcastExec: input_partitions=3, consumer_tasks=1, output_partitions=3 @@ -5714,23 +5283,20 @@ mod tests { │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(c_current_addr_sk@1, ca_address_sk@0)], projection=[c_customer_sk@0, ss_sold_date_sk@2, ss_ext_sales_price@3, ca_county@5, ca_state@6] │ CoalescePartitionsExec - │ [Stage 8] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 8] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_county, ca_state], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_county, ca_state], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_county, s_state, CAST(s_state@24 AS Utf8View) as CAST(store.s_state AS Utf8View)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_county, s_state, CAST(s_state@24 AS Utf8View) as CAST(store.s_state AS Utf8View)], file_type=parquet └────────────────────────────────────────────────── - ┌───── Stage 8 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 8 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(my_customers.c_customer_sk AS Float64)@2, ss_customer_sk@1)], projection=[c_customer_sk@0, c_current_addr_sk@1, ss_sold_date_sk@3, ss_ext_sales_price@5] │ CoalescePartitionsExec │ [Stage 7] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_customer_sk, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_customer_sk, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -5745,9 +5311,8 @@ mod tests { │ CoalescePartitionsExec │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=12, input_tasks=4 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_current_addr_sk@1 as c_current_addr_sk, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] - │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=1 - │ PartitionIsolatorExec: tasks=4 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_addr_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_addr_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p11] t1:[p12..p23] t2:[p24..p35] t3:[p36..p47] │ BroadcastExec: input_partitions=3, consumer_tasks=4, output_partitions=12 @@ -5758,25 +5323,21 @@ mod tests { │ CoalescePartitionsExec │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=12, input_tasks=2 │ DistributedUnionExec: t0:[c0(0/2)] t1:[c0(1/2)] t2:[c1(0/2)] t3:[c1(1/2)] - │ PartitionIsolatorExec: tasks=2 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk@0 as sold_date_sk, cs_bill_customer_sk@3 as customer_sk, cs_item_sk@15 as item_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] - │ PartitionIsolatorExec: tasks=2 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk@0 as sold_date_sk, ws_bill_customer_sk@4 as customer_sk, ws_item_sk@3 as item_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk@0 as sold_date_sk, cs_bill_customer_sk@3 as customer_sk, cs_item_sk@15 as item_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk@0 as sold_date_sk, ws_bill_customer_sk@4 as customer_sk, ws_item_sk@3 as item_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p11] t1:[p12..p23] │ BroadcastExec: input_partitions=3, consumer_tasks=4, output_partitions=12 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_moy@2 = 12 AND d_year@1 = 1998, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_moy@8 = 12 AND d_year@6 = 1998, pruning_predicate=d_moy_null_count@2 != row_count@3 AND d_moy_min@0 <= 12 AND 12 <= d_moy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 1998 AND 1998 <= d_year_max@5, required_guarantees=[d_moy in (12), d_year in (1998)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_moy@8 = 12 AND d_year@6 = 1998, pruning_predicate=d_moy_null_count@2 != row_count@3 AND d_moy_min@0 <= 12 AND 12 <= d_moy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 1998 AND 1998 <= d_year_max@5, required_guarantees=[d_moy in (12), d_year in (1998)] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p11] t1:[p12..p23] │ BroadcastExec: input_partitions=3, consumer_tasks=4, output_partitions=12 │ FilterExec: i_category@2 = Women AND i_class@1 = maternity, projection=[i_item_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_class, i_category], file_type=parquet, predicate=i_category@12 = Women AND i_class@10 = maternity, pruning_predicate=i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Women AND Women <= i_category_max@1 AND i_class_null_count@6 != row_count@3 AND i_class_min@4 <= maternity AND maternity <= i_class_max@5, required_guarantees=[i_category in (Women), i_class in (maternity)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_class, i_category], file_type=parquet, predicate=i_category@12 = Women AND i_class@10 = maternity, pruning_predicate=i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Women AND Women <= i_category_max@1 AND i_class_null_count@6 != row_count@3 AND i_class_min@4 <= maternity AND maternity <= i_class_max@5, required_guarantees=[i_category in (Women), i_class in (maternity)] └────────────────────────────────────────────────── ┌───── Stage 10 ── Tasks: t0:[p0..p2] t1:[p0..p2] │ RepartitionExec: partitioning=Hash([date_dim.d_month_seq + Int64(3)@0], 3), input_partitions=3 @@ -5784,8 +5345,7 @@ mod tests { │ ProjectionExec: expr=[CAST(d_month_seq@0 AS Int64) + 3 as date_dim.d_month_seq + Int64(3)] │ FilterExec: d_year@1 = 1998 AND d_moy@2 = 12, projection=[d_month_seq@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_month_seq, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1998 AND d_moy@8 = 12, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 12 AND 12 <= d_moy_max@5, required_guarantees=[d_moy in (12), d_year in (1998)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_month_seq, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1998 AND d_moy@8 = 12, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 12 AND 12 <= d_moy_max@5, required_guarantees=[d_moy in (12), d_year in (1998)] └────────────────────────────────────────────────── "); Ok(()) @@ -5793,7 +5353,7 @@ mod tests { #[tokio::test] async fn test_tpcds_55() -> Result<()> { let display = test_tpcds_query("q55").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [ext_price@2 DESC, brand_id@0 ASC NULLS LAST], fetch=100 │ [Stage 4] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -5809,27 +5369,24 @@ mod tests { │ AggregateExec: mode=Partial, gby=[i_brand@2 as i_brand, i_brand_id@1 as i_brand_id], aggr=[sum(store_sales.ss_ext_sales_price)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ss_item_sk@0, i_item_sk@0)], projection=[ss_ext_sales_price@1, i_brand_id@3, i_brand@4] │ CoalescePartitionsExec - │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ FilterExec: i_manager_id@3 = 28, projection=[i_item_sk@0, i_brand_id@1, i_brand@2] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_brand, i_manager_id], file_type=parquet, predicate=i_manager_id@20 = 28 AND DynamicFilter [ empty ], pruning_predicate=i_manager_id_null_count@2 != row_count@3 AND i_manager_id_min@0 <= 28 AND 28 <= i_manager_id_max@1, required_guarantees=[i_manager_id in (28)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_brand, i_manager_id], file_type=parquet, predicate=i_manager_id@20 = 28 AND DynamicFilter [ empty ], pruning_predicate=i_manager_id_null_count@2 != row_count@3 AND i_manager_id_min@0 <= 28 AND 28 <= i_manager_id_max@1, required_guarantees=[i_manager_id in (28)] └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ss_sold_date_sk@0)], projection=[ss_item_sk@3, ss_ext_sales_price@4] │ CoalescePartitionsExec │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_moy@2 = 11 AND d_year@1 = 1999, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_moy@8 = 11 AND d_year@6 = 1999, pruning_predicate=d_moy_null_count@2 != row_count@3 AND d_moy_min@0 <= 11 AND 11 <= d_moy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 1999 AND 1999 <= d_year_max@5, required_guarantees=[d_moy in (11), d_year in (1999)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_moy@8 = 11 AND d_year@6 = 1999, pruning_predicate=d_moy_null_count@2 != row_count@3 AND d_moy_min@0 <= 11 AND 11 <= d_moy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 1999 AND 1999 <= d_year_max@5, required_guarantees=[d_moy in (11), d_year in (1999)] └────────────────────────────────────────────────── "); Ok(()) @@ -5837,7 +5394,7 @@ mod tests { #[tokio::test] async fn test_tpcds_56() -> Result<()> { let display = test_tpcds_query("q56").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [total_sales@1 ASC, i_item_id@0 ASC], fetch=100 │ SortExec: TopK(fetch=100), expr=[total_sales@1 ASC, i_item_id@0 ASC], preserve_partitioning=[true] @@ -5880,105 +5437,93 @@ mod tests { │ CoalescePartitionsExec │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ss_addr_sk@1, CAST(customer_address.ca_address_sk AS Float64)@1)], projection=[ss_item_sk@0, ss_ext_sales_price@2] │ CoalescePartitionsExec - │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[ca_address_sk@0 as ca_address_sk, CAST(ca_address_sk@0 AS Float64) as CAST(customer_address.ca_address_sk AS Float64)] │ FilterExec: ca_gmt_offset@1 = Some(-500),4,2, projection=[ca_address_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_gmt_offset], file_type=parquet, predicate=ca_gmt_offset@11 = Some(-500),4,2 AND DynamicFilter [ empty ], pruning_predicate=ca_gmt_offset_null_count@2 != row_count@3 AND ca_gmt_offset_min@0 <= Some(-500),4,2 AND Some(-500),4,2 <= ca_gmt_offset_max@1, required_guarantees=[ca_gmt_offset in (Some(-500),4,2)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_gmt_offset], file_type=parquet, predicate=ca_gmt_offset@11 = Some(-500),4,2 AND DynamicFilter [ empty ], pruning_predicate=ca_gmt_offset_null_count@2 != row_count@3 AND ca_gmt_offset_min@0 <= Some(-500),4,2 AND Some(-500),4,2 <= ca_gmt_offset_max@1, required_guarantees=[ca_gmt_offset in (Some(-500),4,2)] └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ss_sold_date_sk@0)], projection=[ss_item_sk@3, ss_addr_sk@4, ss_ext_sales_price@5] │ CoalescePartitionsExec │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_addr_sk, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_addr_sk, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2001 AND d_moy@2 = 2, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2001 AND d_moy@8 = 2, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 2 AND 2 <= d_moy_max@5, required_guarantees=[d_moy in (2), d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2001 AND d_moy@8 = 2, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 2 AND 2 <= d_moy_max@5, required_guarantees=[d_moy in (2), d_year in (2001)] └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(cs_item_sk@0, i_item_sk@0)], projection=[cs_ext_sales_price@1, i_item_id@3] │ CoalescePartitionsExec │ [Stage 7] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(cs_bill_addr_sk@0, CAST(customer_address.ca_address_sk AS Float64)@1)], projection=[cs_item_sk@1, cs_ext_sales_price@2] │ CoalescePartitionsExec - │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[ca_address_sk@0 as ca_address_sk, CAST(ca_address_sk@0 AS Float64) as CAST(customer_address.ca_address_sk AS Float64)] │ FilterExec: ca_gmt_offset@1 = Some(-500),4,2, projection=[ca_address_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_gmt_offset], file_type=parquet, predicate=ca_gmt_offset@11 = Some(-500),4,2 AND DynamicFilter [ empty ], pruning_predicate=ca_gmt_offset_null_count@2 != row_count@3 AND ca_gmt_offset_min@0 <= Some(-500),4,2 AND Some(-500),4,2 <= ca_gmt_offset_max@1, required_guarantees=[ca_gmt_offset in (Some(-500),4,2)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_gmt_offset], file_type=parquet, predicate=ca_gmt_offset@11 = Some(-500),4,2 AND DynamicFilter [ empty ], pruning_predicate=ca_gmt_offset_null_count@2 != row_count@3 AND ca_gmt_offset_min@0 <= Some(-500),4,2 AND Some(-500),4,2 <= ca_gmt_offset_max@1, required_guarantees=[ca_gmt_offset in (Some(-500),4,2)] └────────────────────────────────────────────────── - ┌───── Stage 6 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 6 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, cs_sold_date_sk@0)], projection=[cs_bill_addr_sk@3, cs_item_sk@4, cs_ext_sales_price@5] │ CoalescePartitionsExec │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_bill_addr_sk, cs_item_sk, cs_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_bill_addr_sk, cs_item_sk, cs_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2001 AND d_moy@2 = 2, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2001 AND d_moy@8 = 2, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 2 AND 2 <= d_moy_max@5, required_guarantees=[d_moy in (2), d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2001 AND d_moy@8 = 2, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 2 AND 2 <= d_moy_max@5, required_guarantees=[d_moy in (2), d_year in (2001)] └────────────────────────────────────────────────── ┌───── Stage 12 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ws_item_sk@0, i_item_sk@0)], projection=[ws_ext_sales_price@1, i_item_id@3] │ CoalescePartitionsExec │ [Stage 11] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 11 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ws_bill_addr_sk@1, CAST(customer_address.ca_address_sk AS Float64)@1)], projection=[ws_item_sk@0, ws_ext_sales_price@2] │ CoalescePartitionsExec - │ [Stage 10] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 10] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[ca_address_sk@0 as ca_address_sk, CAST(ca_address_sk@0 AS Float64) as CAST(customer_address.ca_address_sk AS Float64)] │ FilterExec: ca_gmt_offset@1 = Some(-500),4,2, projection=[ca_address_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_gmt_offset], file_type=parquet, predicate=ca_gmt_offset@11 = Some(-500),4,2 AND DynamicFilter [ empty ], pruning_predicate=ca_gmt_offset_null_count@2 != row_count@3 AND ca_gmt_offset_min@0 <= Some(-500),4,2 AND Some(-500),4,2 <= ca_gmt_offset_max@1, required_guarantees=[ca_gmt_offset in (Some(-500),4,2)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_gmt_offset], file_type=parquet, predicate=ca_gmt_offset@11 = Some(-500),4,2 AND DynamicFilter [ empty ], pruning_predicate=ca_gmt_offset_null_count@2 != row_count@3 AND ca_gmt_offset_min@0 <= Some(-500),4,2 AND Some(-500),4,2 <= ca_gmt_offset_max@1, required_guarantees=[ca_gmt_offset in (Some(-500),4,2)] └────────────────────────────────────────────────── - ┌───── Stage 10 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 10 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ws_sold_date_sk@0)], projection=[ws_item_sk@3, ws_bill_addr_sk@4, ws_ext_sales_price@5] │ CoalescePartitionsExec │ [Stage 9] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_item_sk, ws_bill_addr_sk, ws_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_item_sk, ws_bill_addr_sk, ws_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2001 AND d_moy@2 = 2, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2001 AND d_moy@8 = 2, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 2 AND 2 <= d_moy_max@5, required_guarantees=[d_moy in (2), d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2001 AND d_moy@8 = 2, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 2 AND 2 <= d_moy_max@5, required_guarantees=[d_moy in (2), d_year in (2001)] └────────────────────────────────────────────────── "); Ok(()) @@ -6021,7 +5566,7 @@ mod tests { │ [Stage 4] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([i_category@0, i_brand@1, cc_name@2, d_year@3, d_moy@4], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([i_category@0, i_brand@1, cc_name@2, d_year@3, d_moy@4], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[i_category@1 as i_category, i_brand@0 as i_brand, cc_name@5 as cc_name, d_year@3 as d_year, d_moy@4 as d_moy], aggr=[sum(catalog_sales.cs_sales_price)] │ ProjectionExec: expr=[i_brand@1 as i_brand, i_category@2 as i_category, cs_sales_price@3 as cs_sales_price, d_year@4 as d_year, d_moy@5 as d_moy, cc_name@0 as cc_name] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(call_center.cc_call_center_sk AS Float64)@2, cs_call_center_sk@2)], projection=[cc_name@1, i_brand@3, i_category@4, cs_sales_price@6, d_year@7, d_moy@8] @@ -6034,26 +5579,22 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, cs_item_sk@2)], projection=[i_brand@1, i_category@2, cs_sold_date_sk@3, cs_call_center_sk@4, cs_sales_price@6] │ CoalescePartitionsExec │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_call_center_sk, cs_item_sk, cs_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_call_center_sk, cs_item_sk, cs_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/call_center/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/call_center/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/call_center/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/call_center/part-3.parquet]]}, projection=[cc_call_center_sk, cc_name, CAST(cc_call_center_sk@0 AS Float64) as CAST(call_center.cc_call_center_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/call_center/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/call_center/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/call_center/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/call_center/part-3.parquet]]}, projection=[cc_call_center_sk, cc_name, CAST(cc_call_center_sk@0 AS Float64) as CAST(call_center.cc_call_center_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, d_moy@2 as d_moy, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 1999 OR d_year@1 = 1998 AND d_moy@2 = 12 OR d_year@1 = 2000 AND d_moy@2 = 1 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1999 OR d_year@6 = 1998 AND d_moy@8 = 12 OR d_year@6 = 2000 AND d_moy@8 = 1, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1999 AND 1999 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 12 AND 12 <= d_moy_max@5 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 1 AND 1 <= d_moy_max@5, required_guarantees=[d_year in (1998, 1999, 2000)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1999 OR d_year@6 = 1998 AND d_moy@8 = 12 OR d_year@6 = 2000 AND d_moy@8 = 1, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1999 AND 1999 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 12 AND 12 <= d_moy_max@5 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 1 AND 1 <= d_moy_max@5, required_guarantees=[d_year in (1998, 1999, 2000)] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand, i_category], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand, i_category], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 10 ── Tasks: t0:[p0..p2] t1:[p0..p2] │ RepartitionExec: partitioning=Hash([i_category@0, i_brand@1, cc_name@2], 3), input_partitions=3 @@ -6061,7 +5602,7 @@ mod tests { │ [Stage 9] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([i_category@0, i_brand@1, cc_name@2, d_year@3, d_moy@4], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([i_category@0, i_brand@1, cc_name@2, d_year@3, d_moy@4], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[i_category@1 as i_category, i_brand@0 as i_brand, cc_name@5 as cc_name, d_year@3 as d_year, d_moy@4 as d_moy], aggr=[sum(catalog_sales.cs_sales_price)] │ ProjectionExec: expr=[i_brand@1 as i_brand, i_category@2 as i_category, cs_sales_price@3 as cs_sales_price, d_year@4 as d_year, d_moy@5 as d_moy, cc_name@0 as cc_name] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(call_center.cc_call_center_sk AS Float64)@2, cs_call_center_sk@2)], projection=[cc_name@1, i_brand@3, i_category@4, cs_sales_price@6, d_year@7, d_moy@8] @@ -6074,26 +5615,22 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, cs_item_sk@2)], projection=[i_brand@1, i_category@2, cs_sold_date_sk@3, cs_call_center_sk@4, cs_sales_price@6] │ CoalescePartitionsExec │ [Stage 8] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_call_center_sk, cs_item_sk, cs_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_call_center_sk, cs_item_sk, cs_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/call_center/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/call_center/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/call_center/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/call_center/part-3.parquet]]}, projection=[cc_call_center_sk, cc_name, CAST(cc_call_center_sk@0 AS Float64) as CAST(call_center.cc_call_center_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/call_center/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/call_center/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/call_center/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/call_center/part-3.parquet]]}, projection=[cc_call_center_sk, cc_name, CAST(cc_call_center_sk@0 AS Float64) as CAST(call_center.cc_call_center_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, d_moy@2 as d_moy, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 1999 OR d_year@1 = 1998 AND d_moy@2 = 12 OR d_year@1 = 2000 AND d_moy@2 = 1 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1999 OR d_year@6 = 1998 AND d_moy@8 = 12 OR d_year@6 = 2000 AND d_moy@8 = 1, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1999 AND 1999 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 12 AND 12 <= d_moy_max@5 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 1 AND 1 <= d_moy_max@5, required_guarantees=[d_year in (1998, 1999, 2000)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1999 OR d_year@6 = 1998 AND d_moy@8 = 12 OR d_year@6 = 2000 AND d_moy@8 = 1, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1999 AND 1999 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 12 AND 12 <= d_moy_max@5 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 1 AND 1 <= d_moy_max@5, required_guarantees=[d_year in (1998, 1999, 2000)] └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand, i_category], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand, i_category], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 15 ── Tasks: t0:[p0..p2] t1:[p0..p2] │ RepartitionExec: partitioning=Hash([i_category@0, i_brand@1, cc_name@2], 3), input_partitions=3 @@ -6101,7 +5638,7 @@ mod tests { │ [Stage 14] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 14 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([i_category@0, i_brand@1, cc_name@2, d_year@3, d_moy@4], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([i_category@0, i_brand@1, cc_name@2, d_year@3, d_moy@4], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[i_category@1 as i_category, i_brand@0 as i_brand, cc_name@5 as cc_name, d_year@3 as d_year, d_moy@4 as d_moy], aggr=[sum(catalog_sales.cs_sales_price)] │ ProjectionExec: expr=[i_brand@1 as i_brand, i_category@2 as i_category, cs_sales_price@3 as cs_sales_price, d_year@4 as d_year, d_moy@5 as d_moy, cc_name@0 as cc_name] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(call_center.cc_call_center_sk AS Float64)@2, cs_call_center_sk@2)], projection=[cc_name@1, i_brand@3, i_category@4, cs_sales_price@6, d_year@7, d_moy@8] @@ -6114,26 +5651,22 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, cs_item_sk@2)], projection=[i_brand@1, i_category@2, cs_sold_date_sk@3, cs_call_center_sk@4, cs_sales_price@6] │ CoalescePartitionsExec │ [Stage 13] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_call_center_sk, cs_item_sk, cs_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_call_center_sk, cs_item_sk, cs_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 11 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/call_center/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/call_center/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/call_center/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/call_center/part-3.parquet]]}, projection=[cc_call_center_sk, cc_name, CAST(cc_call_center_sk@0 AS Float64) as CAST(call_center.cc_call_center_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/call_center/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/call_center/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/call_center/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/call_center/part-3.parquet]]}, projection=[cc_call_center_sk, cc_name, CAST(cc_call_center_sk@0 AS Float64) as CAST(call_center.cc_call_center_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 12 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, d_moy@2 as d_moy, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 1999 OR d_year@1 = 1998 AND d_moy@2 = 12 OR d_year@1 = 2000 AND d_moy@2 = 1 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1999 OR d_year@6 = 1998 AND d_moy@8 = 12 OR d_year@6 = 2000 AND d_moy@8 = 1, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1999 AND 1999 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 12 AND 12 <= d_moy_max@5 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 1 AND 1 <= d_moy_max@5, required_guarantees=[d_year in (1998, 1999, 2000)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1999 OR d_year@6 = 1998 AND d_moy@8 = 12 OR d_year@6 = 2000 AND d_moy@8 = 1, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1999 AND 1999 <= d_year_max@1 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 12 AND 12 <= d_moy_max@5 OR d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 1 AND 1 <= d_moy_max@5, required_guarantees=[d_year in (1998, 1999, 2000)] └────────────────────────────────────────────────── ┌───── Stage 13 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand, i_category], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand, i_category], file_type=parquet └────────────────────────────────────────────────── "#); Ok(()) @@ -6201,29 +5734,25 @@ mod tests { │ RepartitionExec: partitioning=Hash([CAST(date_dim.d_date_sk AS Float64)@2], 9), input_partitions=3 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_date@1 as d_date, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([ws_sold_date_sk@0], 9), input_partitions=2 + │ RepartitionExec: partitioning=Hash([ws_sold_date_sk@0], 9), input_partitions=3 │ ProjectionExec: expr=[ws_sold_date_sk@1 as ws_sold_date_sk, ws_ext_sales_price@2 as ws_ext_sales_price, i_item_id@0 as i_item_id] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, ws_item_sk@1)], projection=[i_item_id@1, ws_sold_date_sk@2, ws_ext_sales_price@4] │ CoalescePartitionsExec │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_item_sk, ws_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_item_sk, ws_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p2] t1:[p3..p5] │ BroadcastExec: input_partitions=3, consumer_tasks=1, output_partitions=3 │ FilterExec: d_date@0 = 2000-01-03, projection=[d_week_seq@1] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date, d_week_seq], file_type=parquet, predicate=d_date@2 = 2000-01-03, pruning_predicate=d_date_null_count@2 != row_count@3 AND d_date_min@0 <= 2000-01-03 AND 2000-01-03 <= d_date_max@1, required_guarantees=[d_date in (2000-01-03)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date, d_week_seq], file_type=parquet, predicate=d_date@2 = 2000-01-03, pruning_predicate=d_date_null_count@2 != row_count@3 AND d_date_min@0 <= 2000-01-03 AND 2000-01-03 <= d_date_max@1, required_guarantees=[d_date in (2000-01-03)] └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p2] t1:[p0..p2] t2:[p0..p2] │ ProjectionExec: expr=[cs_ext_sales_price@1 as cs_ext_sales_price, i_item_id@2 as i_item_id, d_date@0 as d_date] @@ -6235,29 +5764,25 @@ mod tests { │ RepartitionExec: partitioning=Hash([CAST(date_dim.d_date_sk AS Float64)@2], 9), input_partitions=3 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_date@1 as d_date, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([cs_sold_date_sk@0], 9), input_partitions=2 + │ RepartitionExec: partitioning=Hash([cs_sold_date_sk@0], 9), input_partitions=3 │ ProjectionExec: expr=[cs_sold_date_sk@1 as cs_sold_date_sk, cs_ext_sales_price@2 as cs_ext_sales_price, i_item_id@0 as i_item_id] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, cs_item_sk@1)], projection=[i_item_id@1, cs_sold_date_sk@2, cs_ext_sales_price@4] │ CoalescePartitionsExec │ [Stage 7] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_item_sk, cs_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_item_sk, cs_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 10 ── Tasks: t0:[p0..p2] t1:[p3..p5] │ BroadcastExec: input_partitions=3, consumer_tasks=1, output_partitions=3 │ FilterExec: d_date@0 = 2000-01-03, projection=[d_week_seq@1] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date, d_week_seq], file_type=parquet, predicate=d_date@2 = 2000-01-03, pruning_predicate=d_date_null_count@2 != row_count@3 AND d_date_min@0 <= 2000-01-03 AND 2000-01-03 <= d_date_max@1, required_guarantees=[d_date in (2000-01-03)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date, d_week_seq], file_type=parquet, predicate=d_date@2 = 2000-01-03, pruning_predicate=d_date_null_count@2 != row_count@3 AND d_date_min@0 <= 2000-01-03 AND 2000-01-03 <= d_date_max@1, required_guarantees=[d_date in (2000-01-03)] └────────────────────────────────────────────────── ┌───── Stage 14 ── Tasks: t0:[p0..p2] t1:[p0..p2] t2:[p0..p2] │ ProjectionExec: expr=[ss_ext_sales_price@1 as ss_ext_sales_price, i_item_id@2 as i_item_id, d_date@0 as d_date] @@ -6269,29 +5794,25 @@ mod tests { │ RepartitionExec: partitioning=Hash([CAST(date_dim.d_date_sk AS Float64)@2], 9), input_partitions=3 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_date@1 as d_date, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 13 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([ss_sold_date_sk@0], 9), input_partitions=2 + │ RepartitionExec: partitioning=Hash([ss_sold_date_sk@0], 9), input_partitions=3 │ ProjectionExec: expr=[ss_sold_date_sk@1 as ss_sold_date_sk, ss_ext_sales_price@2 as ss_ext_sales_price, i_item_id@0 as i_item_id] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, ss_item_sk@1)], projection=[i_item_id@1, ss_sold_date_sk@2, ss_ext_sales_price@4] │ CoalescePartitionsExec │ [Stage 12] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 12 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 15 ── Tasks: t0:[p0..p2] t1:[p3..p5] │ BroadcastExec: input_partitions=3, consumer_tasks=1, output_partitions=3 │ FilterExec: d_date@0 = 2000-01-03, projection=[d_week_seq@1] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date, d_week_seq], file_type=parquet, predicate=d_date@2 = 2000-01-03, pruning_predicate=d_date_null_count@2 != row_count@3 AND d_date_min@0 <= 2000-01-03 AND 2000-01-03 <= d_date_max@1, required_guarantees=[d_date in (2000-01-03)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date, d_week_seq], file_type=parquet, predicate=d_date@2 = 2000-01-03, pruning_predicate=d_date_null_count@2 != row_count@3 AND d_date_min@0 <= 2000-01-03 AND 2000-01-03 <= d_date_max@1, required_guarantees=[d_date in (2000-01-03)] └────────────────────────────────────────────────── "); Ok(()) @@ -6316,8 +5837,7 @@ mod tests { │ [Stage 9] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=6, input_tasks=2 │ FilterExec: d_month_seq@0 >= 1224 AND d_month_seq@0 <= 1235, projection=[d_week_seq@1] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_month_seq, d_week_seq], file_type=parquet, predicate=d_month_seq@3 >= 1224 AND d_month_seq@3 <= 1235 AND DynamicFilter [ empty ], pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1224 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1235, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_month_seq, d_week_seq], file_type=parquet, predicate=d_month_seq@3 >= 1224 AND d_month_seq@3 <= 1235 AND DynamicFilter [ empty ], pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1224 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1235, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=3, consumer_tasks=2, output_partitions=6 @@ -6327,8 +5847,7 @@ mod tests { │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=6, input_tasks=2 │ FilterExec: d_month_seq@0 >= 1212 AND d_month_seq@0 <= 1223, projection=[d_week_seq@1] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_month_seq, d_week_seq], file_type=parquet, predicate=d_month_seq@3 >= 1212 AND d_month_seq@3 <= 1223 AND DynamicFilter [ empty ], pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1212 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1223, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_month_seq, d_week_seq], file_type=parquet, predicate=d_month_seq@3 >= 1212 AND d_month_seq@3 <= 1223 AND DynamicFilter [ empty ], pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1212 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1223, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=3, consumer_tasks=2, output_partitions=6 @@ -6342,25 +5861,22 @@ mod tests { └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p3] t1:[p4..p7] │ BroadcastExec: input_partitions=2, consumer_tasks=2, output_partitions=4 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_id, s_store_name, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_id, s_store_name, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([d_week_seq@0, ss_store_sk@1], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([d_week_seq@0, ss_store_sk@1], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[d_week_seq@2 as d_week_seq, ss_store_sk@0 as ss_store_sk], aggr=[sum(CASE WHEN date_dim.d_day_name = Utf8("Sunday") THEN store_sales.ss_sales_price ELSE NULL END), sum(CASE WHEN date_dim.d_day_name = Utf8("Monday") THEN store_sales.ss_sales_price ELSE NULL END), sum(CASE WHEN date_dim.d_day_name = Utf8("Tuesday") THEN store_sales.ss_sales_price ELSE NULL END), sum(CASE WHEN date_dim.d_day_name = Utf8("Wednesday") THEN store_sales.ss_sales_price ELSE NULL END), sum(CASE WHEN date_dim.d_day_name = Utf8("Thursday") THEN store_sales.ss_sales_price ELSE NULL END), sum(CASE WHEN date_dim.d_day_name = Utf8("Friday") THEN store_sales.ss_sales_price ELSE NULL END), sum(CASE WHEN date_dim.d_day_name = Utf8("Saturday") THEN store_sales.ss_sales_price ELSE NULL END)] │ ProjectionExec: expr=[ss_store_sk@2 as ss_store_sk, ss_sales_price@3 as ss_sales_price, d_week_seq@0 as d_week_seq, d_day_name@1 as d_day_name] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@3, ss_sold_date_sk@0)], projection=[d_week_seq@1, d_day_name@2, ss_store_sk@5, ss_sales_price@6] │ CoalescePartitionsExec │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_store_sk, ss_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_store_sk, ss_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_week_seq@1 as d_week_seq, d_day_name@2 as d_day_name, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_week_seq, d_day_name], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_week_seq, d_day_name], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=3, consumer_tasks=2, output_partitions=6 @@ -6374,25 +5890,22 @@ mod tests { └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p3] t1:[p4..p7] │ BroadcastExec: input_partitions=2, consumer_tasks=2, output_partitions=4 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_id, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_id, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([d_week_seq@0, ss_store_sk@1], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([d_week_seq@0, ss_store_sk@1], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[d_week_seq@2 as d_week_seq, ss_store_sk@0 as ss_store_sk], aggr=[sum(CASE WHEN date_dim.d_day_name = Utf8("Sunday") THEN store_sales.ss_sales_price ELSE NULL END), sum(CASE WHEN date_dim.d_day_name = Utf8("Monday") THEN store_sales.ss_sales_price ELSE NULL END), sum(CASE WHEN date_dim.d_day_name = Utf8("Tuesday") THEN store_sales.ss_sales_price ELSE NULL END), sum(CASE WHEN date_dim.d_day_name = Utf8("Wednesday") THEN store_sales.ss_sales_price ELSE NULL END), sum(CASE WHEN date_dim.d_day_name = Utf8("Thursday") THEN store_sales.ss_sales_price ELSE NULL END), sum(CASE WHEN date_dim.d_day_name = Utf8("Friday") THEN store_sales.ss_sales_price ELSE NULL END), sum(CASE WHEN date_dim.d_day_name = Utf8("Saturday") THEN store_sales.ss_sales_price ELSE NULL END)] │ ProjectionExec: expr=[ss_store_sk@2 as ss_store_sk, ss_sales_price@3 as ss_sales_price, d_week_seq@0 as d_week_seq, d_day_name@1 as d_day_name] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@3, ss_sold_date_sk@0)], projection=[d_week_seq@1, d_day_name@2, ss_store_sk@5, ss_sales_price@6] │ CoalescePartitionsExec │ [Stage 7] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_store_sk, ss_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_store_sk, ss_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_week_seq@1 as d_week_seq, d_day_name@2 as d_day_name, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_week_seq, d_day_name], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_week_seq, d_day_name], file_type=parquet └────────────────────────────────────────────────── "#); Ok(()) @@ -6400,7 +5913,7 @@ mod tests { #[tokio::test] async fn test_tpcds_60() -> Result<()> { let display = test_tpcds_query("q60").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [i_item_id@0 ASC NULLS LAST, total_sales@1 ASC NULLS LAST], fetch=100 │ SortExec: TopK(fetch=100), expr=[i_item_id@0 ASC NULLS LAST, total_sales@1 ASC NULLS LAST], preserve_partitioning=[true] @@ -6443,105 +5956,93 @@ mod tests { │ CoalescePartitionsExec │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ss_addr_sk@1, CAST(customer_address.ca_address_sk AS Float64)@1)], projection=[ss_item_sk@0, ss_ext_sales_price@2] │ CoalescePartitionsExec - │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[ca_address_sk@0 as ca_address_sk, CAST(ca_address_sk@0 AS Float64) as CAST(customer_address.ca_address_sk AS Float64)] │ FilterExec: ca_gmt_offset@1 = Some(-500),4,2, projection=[ca_address_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_gmt_offset], file_type=parquet, predicate=ca_gmt_offset@11 = Some(-500),4,2 AND DynamicFilter [ empty ], pruning_predicate=ca_gmt_offset_null_count@2 != row_count@3 AND ca_gmt_offset_min@0 <= Some(-500),4,2 AND Some(-500),4,2 <= ca_gmt_offset_max@1, required_guarantees=[ca_gmt_offset in (Some(-500),4,2)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_gmt_offset], file_type=parquet, predicate=ca_gmt_offset@11 = Some(-500),4,2 AND DynamicFilter [ empty ], pruning_predicate=ca_gmt_offset_null_count@2 != row_count@3 AND ca_gmt_offset_min@0 <= Some(-500),4,2 AND Some(-500),4,2 <= ca_gmt_offset_max@1, required_guarantees=[ca_gmt_offset in (Some(-500),4,2)] └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ss_sold_date_sk@0)], projection=[ss_item_sk@3, ss_addr_sk@4, ss_ext_sales_price@5] │ CoalescePartitionsExec │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_addr_sk, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_addr_sk, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 1998 AND d_moy@2 = 9, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1998 AND d_moy@8 = 9, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 9 AND 9 <= d_moy_max@5, required_guarantees=[d_moy in (9), d_year in (1998)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1998 AND d_moy@8 = 9, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 9 AND 9 <= d_moy_max@5, required_guarantees=[d_moy in (9), d_year in (1998)] └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(cs_item_sk@0, i_item_sk@0)], projection=[cs_ext_sales_price@1, i_item_id@3] │ CoalescePartitionsExec │ [Stage 7] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(cs_bill_addr_sk@0, CAST(customer_address.ca_address_sk AS Float64)@1)], projection=[cs_item_sk@1, cs_ext_sales_price@2] │ CoalescePartitionsExec - │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[ca_address_sk@0 as ca_address_sk, CAST(ca_address_sk@0 AS Float64) as CAST(customer_address.ca_address_sk AS Float64)] │ FilterExec: ca_gmt_offset@1 = Some(-500),4,2, projection=[ca_address_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_gmt_offset], file_type=parquet, predicate=ca_gmt_offset@11 = Some(-500),4,2 AND DynamicFilter [ empty ], pruning_predicate=ca_gmt_offset_null_count@2 != row_count@3 AND ca_gmt_offset_min@0 <= Some(-500),4,2 AND Some(-500),4,2 <= ca_gmt_offset_max@1, required_guarantees=[ca_gmt_offset in (Some(-500),4,2)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_gmt_offset], file_type=parquet, predicate=ca_gmt_offset@11 = Some(-500),4,2 AND DynamicFilter [ empty ], pruning_predicate=ca_gmt_offset_null_count@2 != row_count@3 AND ca_gmt_offset_min@0 <= Some(-500),4,2 AND Some(-500),4,2 <= ca_gmt_offset_max@1, required_guarantees=[ca_gmt_offset in (Some(-500),4,2)] └────────────────────────────────────────────────── - ┌───── Stage 6 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 6 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, cs_sold_date_sk@0)], projection=[cs_bill_addr_sk@3, cs_item_sk@4, cs_ext_sales_price@5] │ CoalescePartitionsExec │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_bill_addr_sk, cs_item_sk, cs_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_bill_addr_sk, cs_item_sk, cs_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 1998 AND d_moy@2 = 9, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1998 AND d_moy@8 = 9, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 9 AND 9 <= d_moy_max@5, required_guarantees=[d_moy in (9), d_year in (1998)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1998 AND d_moy@8 = 9, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 9 AND 9 <= d_moy_max@5, required_guarantees=[d_moy in (9), d_year in (1998)] └────────────────────────────────────────────────── ┌───── Stage 12 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ws_item_sk@0, i_item_sk@0)], projection=[ws_ext_sales_price@1, i_item_id@3] │ CoalescePartitionsExec │ [Stage 11] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 11 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ws_bill_addr_sk@1, CAST(customer_address.ca_address_sk AS Float64)@1)], projection=[ws_item_sk@0, ws_ext_sales_price@2] │ CoalescePartitionsExec - │ [Stage 10] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 10] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[ca_address_sk@0 as ca_address_sk, CAST(ca_address_sk@0 AS Float64) as CAST(customer_address.ca_address_sk AS Float64)] │ FilterExec: ca_gmt_offset@1 = Some(-500),4,2, projection=[ca_address_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_gmt_offset], file_type=parquet, predicate=ca_gmt_offset@11 = Some(-500),4,2 AND DynamicFilter [ empty ], pruning_predicate=ca_gmt_offset_null_count@2 != row_count@3 AND ca_gmt_offset_min@0 <= Some(-500),4,2 AND Some(-500),4,2 <= ca_gmt_offset_max@1, required_guarantees=[ca_gmt_offset in (Some(-500),4,2)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_gmt_offset], file_type=parquet, predicate=ca_gmt_offset@11 = Some(-500),4,2 AND DynamicFilter [ empty ], pruning_predicate=ca_gmt_offset_null_count@2 != row_count@3 AND ca_gmt_offset_min@0 <= Some(-500),4,2 AND Some(-500),4,2 <= ca_gmt_offset_max@1, required_guarantees=[ca_gmt_offset in (Some(-500),4,2)] └────────────────────────────────────────────────── - ┌───── Stage 10 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 10 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ws_sold_date_sk@0)], projection=[ws_item_sk@3, ws_bill_addr_sk@4, ws_ext_sales_price@5] │ CoalescePartitionsExec │ [Stage 9] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_item_sk, ws_bill_addr_sk, ws_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_item_sk, ws_bill_addr_sk, ws_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 1998 AND d_moy@2 = 9, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1998 AND d_moy@8 = 9, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 9 AND 9 <= d_moy_max@5, required_guarantees=[d_moy in (9), d_year in (1998)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1998 AND d_moy@8 = 9, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 9 AND 9 <= d_moy_max@5, required_guarantees=[d_moy in (9), d_year in (1998)] └────────────────────────────────────────────────── "); Ok(()) @@ -6549,7 +6050,7 @@ mod tests { #[tokio::test] async fn test_tpcds_61() -> Result<()> { let display = test_tpcds_query("q61").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortExec: TopK(fetch=100), expr=[total@1 ASC NULLS LAST], preserve_partitioning=[false] │ ProjectionExec: expr=[promotions@0 as promotions, total@1 as total, CAST(promotions@0 AS Decimal128(15, 4)) / CAST(total@1 AS Decimal128(15, 4)) * Some(100),20,0 as promotional_sales.promotions / all_sales.total * Int64(100)] @@ -6570,8 +6071,7 @@ mod tests { │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ FilterExec: i_category@1 = Jewelry, projection=[i_item_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_category], file_type=parquet, predicate=i_category@12 = Jewelry AND DynamicFilter [ empty ], pruning_predicate=i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Jewelry AND Jewelry <= i_category_max@1, required_guarantees=[i_category in (Jewelry)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_category], file_type=parquet, predicate=i_category@12 = Jewelry AND DynamicFilter [ empty ], pruning_predicate=i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Jewelry AND Jewelry <= i_category_max@1, required_guarantees=[i_category in (Jewelry)] └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -6580,8 +6080,7 @@ mod tests { │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ FilterExec: ca_gmt_offset@1 = Some(-500),4,2, projection=[ca_address_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_gmt_offset], file_type=parquet, predicate=ca_gmt_offset@11 = Some(-500),4,2 AND DynamicFilter [ empty ], pruning_predicate=ca_gmt_offset_null_count@2 != row_count@3 AND ca_gmt_offset_min@0 <= Some(-500),4,2 AND Some(-500),4,2 <= ca_gmt_offset_max@1, required_guarantees=[ca_gmt_offset in (Some(-500),4,2)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_gmt_offset], file_type=parquet, predicate=ca_gmt_offset@11 = Some(-500),4,2 AND DynamicFilter [ empty ], pruning_predicate=ca_gmt_offset_null_count@2 != row_count@3 AND ca_gmt_offset_min@0 <= Some(-500),4,2 AND Some(-500),4,2 <= ca_gmt_offset_max@1, required_guarantees=[ca_gmt_offset in (Some(-500),4,2)] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -6590,8 +6089,7 @@ mod tests { │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_current_addr_sk@1 as c_current_addr_sk, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_addr_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_addr_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -6601,35 +6099,31 @@ mod tests { │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 1998 AND d_moy@2 = 11, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1998 AND d_moy@8 = 11 AND DynamicFilter [ empty ], pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 11 AND 11 <= d_moy_max@5, required_guarantees=[d_moy in (11), d_year in (1998)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1998 AND d_moy@8 = 11 AND DynamicFilter [ empty ], pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 11 AND 11 <= d_moy_max@5, required_guarantees=[d_moy in (11), d_year in (1998)] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ss_promo_sk@3, CAST(promotion.p_promo_sk AS Float64)@1)], projection=[ss_sold_date_sk@0, ss_item_sk@1, ss_customer_sk@2, ss_ext_sales_price@4] │ CoalescePartitionsExec - │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[p_promo_sk@0 as p_promo_sk, CAST(p_promo_sk@0 AS Float64) as CAST(promotion.p_promo_sk AS Float64)] │ FilterExec: p_channel_dmail@1 = Y OR p_channel_email@2 = Y OR p_channel_tv@3 = Y, projection=[p_promo_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/promotion/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-3.parquet]]}, projection=[p_promo_sk, p_channel_dmail, p_channel_email, p_channel_tv], file_type=parquet, predicate=(p_channel_dmail@8 = Y OR p_channel_email@9 = Y OR p_channel_tv@11 = Y) AND DynamicFilter [ empty ], pruning_predicate=p_channel_dmail_null_count@2 != row_count@3 AND p_channel_dmail_min@0 <= Y AND Y <= p_channel_dmail_max@1 OR p_channel_email_null_count@6 != row_count@3 AND p_channel_email_min@4 <= Y AND Y <= p_channel_email_max@5 OR p_channel_tv_null_count@9 != row_count@3 AND p_channel_tv_min@7 <= Y AND Y <= p_channel_tv_max@8, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/promotion/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/promotion/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/promotion/part-3.parquet]]}, projection=[p_promo_sk, p_channel_dmail, p_channel_email, p_channel_tv], file_type=parquet, predicate=(p_channel_dmail@8 = Y OR p_channel_email@9 = Y OR p_channel_tv@11 = Y) AND DynamicFilter [ empty ], pruning_predicate=p_channel_dmail_null_count@2 != row_count@3 AND p_channel_dmail_min@0 <= Y AND Y <= p_channel_dmail_max@1 OR p_channel_email_null_count@6 != row_count@3 AND p_channel_email_min@4 <= Y AND Y <= p_channel_email_max@5 OR p_channel_tv_null_count@9 != row_count@3 AND p_channel_tv_min@7 <= Y AND Y <= p_channel_tv_max@8, required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(store.s_store_sk AS Float64)@1, ss_store_sk@3)], projection=[ss_sold_date_sk@2, ss_item_sk@3, ss_customer_sk@4, ss_promo_sk@6, ss_ext_sales_price@7] │ CoalescePartitionsExec │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_customer_sk, ss_store_sk, ss_promo_sk, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_customer_sk, ss_store_sk, ss_promo_sk, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[s_store_sk@0 as s_store_sk, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)] │ FilterExec: s_gmt_offset@1 = Some(-500),3,2, projection=[s_store_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_gmt_offset], file_type=parquet, predicate=s_gmt_offset@27 = Some(-500),3,2, pruning_predicate=s_gmt_offset_null_count@2 != row_count@3 AND s_gmt_offset_min@0 <= Some(-500),3,2 AND Some(-500),3,2 <= s_gmt_offset_max@1, required_guarantees=[s_gmt_offset in (Some(-500),3,2)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_gmt_offset], file_type=parquet, predicate=s_gmt_offset@27 = Some(-500),3,2, pruning_predicate=s_gmt_offset_null_count@2 != row_count@3 AND s_gmt_offset_min@0 <= Some(-500),3,2 AND Some(-500),3,2 <= s_gmt_offset_max@1, required_guarantees=[s_gmt_offset in (Some(-500),3,2)] └────────────────────────────────────────────────── ┌───── Stage 13 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[sum(store_sales.ss_ext_sales_price)] @@ -6638,8 +6132,7 @@ mod tests { │ [Stage 12] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ FilterExec: i_category@1 = Jewelry, projection=[i_item_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_category], file_type=parquet, predicate=i_category@12 = Jewelry AND DynamicFilter [ empty ], pruning_predicate=i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Jewelry AND Jewelry <= i_category_max@1, required_guarantees=[i_category in (Jewelry)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_category], file_type=parquet, predicate=i_category@12 = Jewelry AND DynamicFilter [ empty ], pruning_predicate=i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Jewelry AND Jewelry <= i_category_max@1, required_guarantees=[i_category in (Jewelry)] └────────────────────────────────────────────────── ┌───── Stage 12 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -6648,8 +6141,7 @@ mod tests { │ [Stage 11] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ FilterExec: ca_gmt_offset@1 = Some(-500),4,2, projection=[ca_address_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_gmt_offset], file_type=parquet, predicate=ca_gmt_offset@11 = Some(-500),4,2 AND DynamicFilter [ empty ], pruning_predicate=ca_gmt_offset_null_count@2 != row_count@3 AND ca_gmt_offset_min@0 <= Some(-500),4,2 AND Some(-500),4,2 <= ca_gmt_offset_max@1, required_guarantees=[ca_gmt_offset in (Some(-500),4,2)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_gmt_offset], file_type=parquet, predicate=ca_gmt_offset@11 = Some(-500),4,2 AND DynamicFilter [ empty ], pruning_predicate=ca_gmt_offset_null_count@2 != row_count@3 AND ca_gmt_offset_min@0 <= Some(-500),4,2 AND Some(-500),4,2 <= ca_gmt_offset_max@1, required_guarantees=[ca_gmt_offset in (Some(-500),4,2)] └────────────────────────────────────────────────── ┌───── Stage 11 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -6658,35 +6150,31 @@ mod tests { │ [Stage 10] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_current_addr_sk@1 as c_current_addr_sk, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_addr_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_addr_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 10 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ss_sold_date_sk@0, CAST(date_dim.d_date_sk AS Float64)@1)], projection=[ss_item_sk@1, ss_customer_sk@2, ss_ext_sales_price@3] │ CoalescePartitionsExec - │ [Stage 9] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 9] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 1998 AND d_moy@2 = 11, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1998 AND d_moy@8 = 11 AND DynamicFilter [ empty ], pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 11 AND 11 <= d_moy_max@5, required_guarantees=[d_moy in (11), d_year in (1998)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1998 AND d_moy@8 = 11 AND DynamicFilter [ empty ], pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 11 AND 11 <= d_moy_max@5, required_guarantees=[d_moy in (11), d_year in (1998)] └────────────────────────────────────────────────── - ┌───── Stage 9 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 9 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(store.s_store_sk AS Float64)@1, ss_store_sk@3)], projection=[ss_sold_date_sk@2, ss_item_sk@3, ss_customer_sk@4, ss_ext_sales_price@6] │ CoalescePartitionsExec │ [Stage 8] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_customer_sk, ss_store_sk, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_customer_sk, ss_store_sk, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[s_store_sk@0 as s_store_sk, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)] │ FilterExec: s_gmt_offset@1 = Some(-500),3,2, projection=[s_store_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_gmt_offset], file_type=parquet, predicate=s_gmt_offset@27 = Some(-500),3,2, pruning_predicate=s_gmt_offset_null_count@2 != row_count@3 AND s_gmt_offset_min@0 <= Some(-500),3,2 AND Some(-500),3,2 <= s_gmt_offset_max@1, required_guarantees=[s_gmt_offset in (Some(-500),3,2)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_gmt_offset], file_type=parquet, predicate=s_gmt_offset@27 = Some(-500),3,2, pruning_predicate=s_gmt_offset_null_count@2 != row_count@3 AND s_gmt_offset_min@0 <= Some(-500),3,2 AND Some(-500),3,2 <= s_gmt_offset_max@1, required_guarantees=[s_gmt_offset in (Some(-500),3,2)] └────────────────────────────────────────────────── "); Ok(()) @@ -6694,7 +6182,7 @@ mod tests { #[tokio::test] async fn test_tpcds_62() -> Result<()> { let display = test_tpcds_query("q62").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [w_substr@0 ASC, sm_type@1 ASC, web_name@2 ASC], fetch=100 │ [Stage 6] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -6715,38 +6203,33 @@ mod tests { │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_month_seq@1 >= 1200 AND d_month_seq@1 <= 1211, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_month_seq], file_type=parquet, predicate=d_month_seq@3 >= 1200 AND d_month_seq@3 <= 1211 AND DynamicFilter [ empty ], pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1200 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1211, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_month_seq], file_type=parquet, predicate=d_month_seq@3 >= 1200 AND d_month_seq@3 <= 1211 AND DynamicFilter [ empty ], pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1200 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1211, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ws_web_site_sk@2, CAST(web_site.web_site_sk AS Float64)@2)], projection=[ws_sold_date_sk@0, ws_ship_date_sk@1, w_substr@3, sm_type@4, web_name@6] │ CoalescePartitionsExec │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_site/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_site/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_site/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_site/part-3.parquet]]}, projection=[web_site_sk, web_name, CAST(web_site_sk@0 AS Float64) as CAST(web_site.web_site_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_site/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/web_site/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_site/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/web_site/part-3.parquet]]}, projection=[web_site_sk, web_name, CAST(web_site_sk@0 AS Float64) as CAST(web_site.web_site_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ws_ship_mode_sk@3, CAST(ship_mode.sm_ship_mode_sk AS Float64)@2)], projection=[ws_sold_date_sk@0, ws_ship_date_sk@1, ws_web_site_sk@2, w_substr@4, sm_type@6] │ CoalescePartitionsExec - │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/ship_mode/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/ship_mode/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/ship_mode/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/ship_mode/part-3.parquet]]}, projection=[sm_ship_mode_sk, sm_type, CAST(sm_ship_mode_sk@0 AS Float64) as CAST(ship_mode.sm_ship_mode_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] + │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/ship_mode/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/ship_mode/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/ship_mode/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/ship_mode/part-3.parquet]]}, projection=[sm_ship_mode_sk, sm_type, CAST(sm_ship_mode_sk@0 AS Float64) as CAST(ship_mode.sm_ship_mode_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[ws_sold_date_sk@1 as ws_sold_date_sk, ws_ship_date_sk@2 as ws_ship_date_sk, ws_web_site_sk@3 as ws_web_site_sk, ws_ship_mode_sk@4 as ws_ship_mode_sk, w_substr@0 as w_substr] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(sq1.w_warehouse_sk AS Float64)@2, ws_warehouse_sk@4)], projection=[w_substr@0, ws_sold_date_sk@3, ws_ship_date_sk@4, ws_web_site_sk@5, ws_ship_mode_sk@6] │ CoalescePartitionsExec │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_ship_date_sk, ws_web_site_sk, ws_ship_mode_sk, ws_warehouse_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_ship_date_sk, ws_web_site_sk, ws_ship_mode_sk, ws_warehouse_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/warehouse/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-3.parquet]]}, projection=[substr(w_warehouse_name@2, 1, 20) as w_substr, w_warehouse_sk, CAST(w_warehouse_sk@0 AS Float64) as CAST(sq1.w_warehouse_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/warehouse/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/warehouse/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/warehouse/part-3.parquet]]}, projection=[substr(w_warehouse_name@2, 1, 20) as w_substr, w_warehouse_sk, CAST(w_warehouse_sk@0 AS Float64) as CAST(sq1.w_warehouse_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── "); Ok(()) @@ -6771,7 +6254,7 @@ mod tests { │ [Stage 4] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([i_manager_id@0, d_moy@1], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([i_manager_id@0, d_moy@1], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[i_manager_id@0 as i_manager_id, d_moy@2 as d_moy], aggr=[sum(store_sales.ss_sales_price)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(store.s_store_sk AS Float64)@1, ss_store_sk@1)], projection=[i_manager_id@2, ss_sales_price@4, d_moy@5] │ CoalescePartitionsExec @@ -6783,28 +6266,24 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, ss_item_sk@1)], projection=[i_manager_id@1, ss_sold_date_sk@2, ss_store_sk@4, ss_sales_price@5] │ CoalescePartitionsExec │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_store_sk, ss_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_store_sk, ss_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_moy@1 as d_moy, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_month_seq@1 IN (SET) ([1200, 1201, 1202, 1203, 1204, 1205, 1206, 1207, 1208, 1209, 1210, 1211]), projection=[d_date_sk@0, d_moy@2] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_month_seq, d_moy], file_type=parquet, predicate=d_month_seq@3 IN (SET) ([1200, 1201, 1202, 1203, 1204, 1205, 1206, 1207, 1208, 1209, 1210, 1211]), pruning_predicate=d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1200 AND 1200 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1201 AND 1201 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1202 AND 1202 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1203 AND 1203 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1204 AND 1204 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1205 AND 1205 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1206 AND 1206 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1207 AND 1207 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1208 AND 1208 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1209 AND 1209 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1210 AND 1210 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1211 AND 1211 <= d_month_seq_max@1, required_guarantees=[d_month_seq in (1200, 1201, 1202, 1203, 1204, 1205, 1206, 1207, 1208, 1209, 1210, 1211)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_month_seq, d_moy], file_type=parquet, predicate=d_month_seq@3 IN (SET) ([1200, 1201, 1202, 1203, 1204, 1205, 1206, 1207, 1208, 1209, 1210, 1211]), pruning_predicate=d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1200 AND 1200 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1201 AND 1201 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1202 AND 1202 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1203 AND 1203 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1204 AND 1204 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1205 AND 1205 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1206 AND 1206 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1207 AND 1207 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1208 AND 1208 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1209 AND 1209 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1210 AND 1210 <= d_month_seq_max@1 OR d_month_seq_null_count@2 != row_count@3 AND d_month_seq_min@0 <= 1211 AND 1211 <= d_month_seq_max@1, required_guarantees=[d_month_seq in (1200, 1201, 1202, 1203, 1204, 1205, 1206, 1207, 1208, 1209, 1210, 1211)] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ FilterExec: (i_category@3 = Books OR i_category@3 = Children OR i_category@3 = Electronics) AND i_class@2 IN (SET) ([personal, portable, reference, self-help]) AND i_brand@1 IN (SET) ([scholaramalgamalg #14, scholaramalgamalg #7, exportiunivamalg #9, scholaramalgamalg #9]) OR (i_category@3 = Women OR i_category@3 = Music OR i_category@3 = Men) AND i_class@2 IN (SET) ([accessories, classical, fragrances, pants]) AND i_brand@1 IN (SET) ([amalgimporto #1, edu packscholar #1, exportiimporto #1, importoamalg #1]), projection=[i_item_sk@0, i_manager_id@4] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand, i_class, i_category, i_manager_id], file_type=parquet, predicate=(i_category@12 = Books OR i_category@12 = Children OR i_category@12 = Electronics) AND i_class@10 IN (SET) ([personal, portable, reference, self-help]) AND i_brand@8 IN (SET) ([scholaramalgamalg #14, scholaramalgamalg #7, exportiunivamalg #9, scholaramalgamalg #9]) OR (i_category@12 = Women OR i_category@12 = Music OR i_category@12 = Men) AND i_class@10 IN (SET) ([accessories, classical, fragrances, pants]) AND i_brand@8 IN (SET) ([amalgimporto #1, edu packscholar #1, exportiimporto #1, importoamalg #1]), pruning_predicate=(i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Books AND Books <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Children AND Children <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Electronics AND Electronics <= i_category_max@1) AND (i_class_null_count@6 != row_count@3 AND i_class_min@4 <= personal AND personal <= i_class_max@5 OR i_class_null_count@6 != row_count@3 AND i_class_min@4 <= portable AND portable <= i_class_max@5 OR i_class_null_count@6 != row_count@3 AND i_class_min@4 <= reference AND reference <= i_class_max@5 OR i_class_null_count@6 != row_count@3 AND i_class_min@4 <= self-help AND self-help <= i_class_max@5) AND (i_brand_null_count@9 != row_count@3 AND i_brand_min@7 <= scholaramalgamalg #14 AND scholaramalgamalg #14 <= i_brand_max@8 OR i_brand_null_count@9 != row_count@3 AND i_brand_min@7 <= scholaramalgamalg #7 AND scholaramalgamalg #7 <= i_brand_max@8 OR i_brand_null_count@9 != row_count@3 AND i_brand_min@7 <= exportiunivamalg #9 AND exportiunivamalg #9 <= i_brand_max@8 OR i_brand_null_count@9 != row_count@3 AND i_brand_min@7 <= scholaramalgamalg #9 AND scholaramalgamalg #9 <= i_brand_max@8) OR (i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Women AND Women <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Music AND Music <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Men AND Men <= i_category_max@1) AND (i_class_null_count@6 != row_count@3 AND i_class_min@4 <= accessories AND accessories <= i_class_max@5 OR i_class_null_count@6 != row_count@3 AND i_class_min@4 <= classical AND classical <= i_class_max@5 OR i_class_null_count@6 != row_count@3 AND i_class_min@4 <= fragrances AND fragrances <= i_class_max@5 OR i_class_null_count@6 != row_count@3 AND i_class_min@4 <= pants AND pants <= i_class_max@5) AND (i_brand_null_count@9 != row_count@3 AND i_brand_min@7 <= amalgimporto #1 AND amalgimporto #1 <= i_brand_max@8 OR i_brand_null_count@9 != row_count@3 AND i_brand_min@7 <= edu packscholar #1 AND edu packscholar #1 <= i_brand_max@8 OR i_brand_null_count@9 != row_count@3 AND i_brand_min@7 <= exportiimporto #1 AND exportiimporto #1 <= i_brand_max@8 OR i_brand_null_count@9 != row_count@3 AND i_brand_min@7 <= importoamalg #1 AND importoamalg #1 <= i_brand_max@8), required_guarantees=[i_brand in (amalgimporto #1, edu packscholar #1, exportiimporto #1, exportiunivamalg #9, importoamalg #1, scholaramalgamalg #14, scholaramalgamalg #7, scholaramalgamalg #9), i_class in (accessories, classical, fragrances, pants, personal, portable, reference, self-help)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand, i_class, i_category, i_manager_id], file_type=parquet, predicate=(i_category@12 = Books OR i_category@12 = Children OR i_category@12 = Electronics) AND i_class@10 IN (SET) ([personal, portable, reference, self-help]) AND i_brand@8 IN (SET) ([scholaramalgamalg #14, scholaramalgamalg #7, exportiunivamalg #9, scholaramalgamalg #9]) OR (i_category@12 = Women OR i_category@12 = Music OR i_category@12 = Men) AND i_class@10 IN (SET) ([accessories, classical, fragrances, pants]) AND i_brand@8 IN (SET) ([amalgimporto #1, edu packscholar #1, exportiimporto #1, importoamalg #1]), pruning_predicate=(i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Books AND Books <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Children AND Children <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Electronics AND Electronics <= i_category_max@1) AND (i_class_null_count@6 != row_count@3 AND i_class_min@4 <= personal AND personal <= i_class_max@5 OR i_class_null_count@6 != row_count@3 AND i_class_min@4 <= portable AND portable <= i_class_max@5 OR i_class_null_count@6 != row_count@3 AND i_class_min@4 <= reference AND reference <= i_class_max@5 OR i_class_null_count@6 != row_count@3 AND i_class_min@4 <= self-help AND self-help <= i_class_max@5) AND (i_brand_null_count@9 != row_count@3 AND i_brand_min@7 <= scholaramalgamalg #14 AND scholaramalgamalg #14 <= i_brand_max@8 OR i_brand_null_count@9 != row_count@3 AND i_brand_min@7 <= scholaramalgamalg #7 AND scholaramalgamalg #7 <= i_brand_max@8 OR i_brand_null_count@9 != row_count@3 AND i_brand_min@7 <= exportiunivamalg #9 AND exportiunivamalg #9 <= i_brand_max@8 OR i_brand_null_count@9 != row_count@3 AND i_brand_min@7 <= scholaramalgamalg #9 AND scholaramalgamalg #9 <= i_brand_max@8) OR (i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Women AND Women <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Music AND Music <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Men AND Men <= i_category_max@1) AND (i_class_null_count@6 != row_count@3 AND i_class_min@4 <= accessories AND accessories <= i_class_max@5 OR i_class_null_count@6 != row_count@3 AND i_class_min@4 <= classical AND classical <= i_class_max@5 OR i_class_null_count@6 != row_count@3 AND i_class_min@4 <= fragrances AND fragrances <= i_class_max@5 OR i_class_null_count@6 != row_count@3 AND i_class_min@4 <= pants AND pants <= i_class_max@5) AND (i_brand_null_count@9 != row_count@3 AND i_brand_min@7 <= amalgimporto #1 AND amalgimporto #1 <= i_brand_max@8 OR i_brand_null_count@9 != row_count@3 AND i_brand_min@7 <= edu packscholar #1 AND edu packscholar #1 <= i_brand_max@8 OR i_brand_null_count@9 != row_count@3 AND i_brand_min@7 <= exportiimporto #1 AND exportiimporto #1 <= i_brand_max@8 OR i_brand_null_count@9 != row_count@3 AND i_brand_min@7 <= importoamalg #1 AND importoamalg #1 <= i_brand_max@8), required_guarantees=[i_brand in (amalgimporto #1, edu packscholar #1, exportiimporto #1, exportiunivamalg #9, importoamalg #1, scholaramalgamalg #14, scholaramalgamalg #7, scholaramalgamalg #9), i_class in (accessories, classical, fragrances, pants, personal, portable, reference, self-help)] └────────────────────────────────────────────────── "#); Ok(()) @@ -6812,7 +6291,7 @@ mod tests { #[tokio::test] async fn test_tpcds_64() -> Result<()> { let display = test_tpcds_query("q64").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ ProjectionExec: expr=[product_name@0 as product_name, store_name@1 as store_name, store_zip@2 as store_zip, b_street_number@3 as b_street_number, b_street_name@4 as b_street_name, b_city@5 as b_city, b_zip@6 as b_zip, c_street_number@7 as c_street_number, c_street_name@8 as c_street_name, c_city@9 as c_city, c_zip@10 as c_zip, cs1syear@11 as cs1syear, cs1cnt@12 as cs1cnt, s11@13 as s11, s21@14 as s21, s31@15 as s31, s12@16 as s12, s22@17 as s22, s32@18 as s32, syear@19 as syear, cnt@20 as cnt] │ SortPreservingMergeExec: [product_name@0 ASC NULLS LAST, store_name@1 ASC NULLS LAST, cnt@20 ASC NULLS LAST, s1@21 ASC NULLS LAST, s1@22 ASC NULLS LAST] @@ -6843,16 +6322,14 @@ mod tests { │ [Stage 21] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 │ FilterExec: i_color@2 IN (SET) ([purple, burlywood, indian, spring, floral, medium]) AND i_current_price@1 >= Some(6500),4,2 AND i_current_price@1 <= Some(7400),4,2, projection=[i_item_sk@0, i_product_name@3] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_current_price, i_color, i_product_name], file_type=parquet, predicate=i_color@17 IN (SET) ([purple, burlywood, indian, spring, floral, medium]) AND i_current_price@5 >= Some(6500),4,2 AND i_current_price@5 <= Some(7400),4,2 AND DynamicFilter [ empty ], pruning_predicate=(i_color_null_count@2 != row_count@3 AND i_color_min@0 <= purple AND purple <= i_color_max@1 OR i_color_null_count@2 != row_count@3 AND i_color_min@0 <= burlywood AND burlywood <= i_color_max@1 OR i_color_null_count@2 != row_count@3 AND i_color_min@0 <= indian AND indian <= i_color_max@1 OR i_color_null_count@2 != row_count@3 AND i_color_min@0 <= spring AND spring <= i_color_max@1 OR i_color_null_count@2 != row_count@3 AND i_color_min@0 <= floral AND floral <= i_color_max@1 OR i_color_null_count@2 != row_count@3 AND i_color_min@0 <= medium AND medium <= i_color_max@1) AND i_current_price_null_count@5 != row_count@3 AND i_current_price_max@4 >= Some(6500),4,2 AND i_current_price_null_count@5 != row_count@3 AND i_current_price_min@6 <= Some(7400),4,2, required_guarantees=[i_color in (burlywood, floral, indian, medium, purple, spring)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_current_price, i_color, i_product_name], file_type=parquet, predicate=i_color@17 IN (SET) ([purple, burlywood, indian, spring, floral, medium]) AND i_current_price@5 >= Some(6500),4,2 AND i_current_price@5 <= Some(7400),4,2 AND DynamicFilter [ empty ], pruning_predicate=(i_color_null_count@2 != row_count@3 AND i_color_min@0 <= purple AND purple <= i_color_max@1 OR i_color_null_count@2 != row_count@3 AND i_color_min@0 <= burlywood AND burlywood <= i_color_max@1 OR i_color_null_count@2 != row_count@3 AND i_color_min@0 <= indian AND indian <= i_color_max@1 OR i_color_null_count@2 != row_count@3 AND i_color_min@0 <= spring AND spring <= i_color_max@1 OR i_color_null_count@2 != row_count@3 AND i_color_min@0 <= floral AND floral <= i_color_max@1 OR i_color_null_count@2 != row_count@3 AND i_color_min@0 <= medium AND medium <= i_color_max@1) AND i_current_price_null_count@5 != row_count@3 AND i_current_price_max@4 >= Some(6500),4,2 AND i_current_price_null_count@5 != row_count@3 AND i_current_price_min@6 <= Some(7400),4,2, required_guarantees=[i_color in (burlywood, floral, indian, medium, purple, spring)] └────────────────────────────────────────────────── ┌───── Stage 21 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(hd_income_band_sk@9, ib_income_band_sk@0)], projection=[ss_item_sk@0, ss_wholesale_cost@1, ss_list_price@2, ss_coupon_amt@3, d_year@4, s_store_name@5, s_zip@6, d_year@7, d_year@8, ca_street_number@10, ca_street_name@11, ca_city@12, ca_zip@13, ca_street_number@14, ca_street_name@15, ca_city@16, ca_zip@17] │ CoalescePartitionsExec │ [Stage 20] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/income_band/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/income_band/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/income_band/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/income_band/part-3.parquet]]}, projection=[ib_income_band_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/income_band/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/income_band/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/income_band/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/income_band/part-3.parquet]]}, projection=[ib_income_band_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 20 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -6863,13 +6340,11 @@ mod tests { │ CoalescePartitionsExec │ [Stage 19] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_street_number, ca_street_name, ca_city, ca_zip], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_street_number, ca_street_name, ca_city, ca_zip], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/income_band/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/income_band/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/income_band/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/income_band/part-3.parquet]]}, projection=[ib_income_band_sk], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/income_band/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/income_band/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/income_band/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/income_band/part-3.parquet]]}, projection=[ib_income_band_sk], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 19 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -6878,48 +6353,42 @@ mod tests { │ [Stage 18] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 │ ProjectionExec: expr=[ca_address_sk@0 as ca_address_sk, ca_street_number@1 as ca_street_number, ca_street_name@2 as ca_street_name, ca_city@3 as ca_city, ca_zip@4 as ca_zip, CAST(ca_address_sk@0 AS Float64) as CAST(ad1.ca_address_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_street_number, ca_street_name, ca_city, ca_zip], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_street_number, ca_street_name, ca_city, ca_zip], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 18 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(c_current_hdemo_sk@8, CAST(hd2.hd_demo_sk AS Float64)@2)], projection=[ss_item_sk@0, ss_addr_sk@1, ss_wholesale_cost@2, ss_list_price@3, ss_coupon_amt@4, d_year@5, s_store_name@6, s_zip@7, c_current_addr_sk@9, d_year@10, d_year@11, hd_income_band_sk@12, hd_income_band_sk@14] │ CoalescePartitionsExec │ [Stage 17] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_income_band_sk, CAST(hd_demo_sk@0 AS Float64) as CAST(hd2.hd_demo_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_income_band_sk, CAST(hd_demo_sk@0 AS Float64) as CAST(hd2.hd_demo_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 17 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ss_hdemo_sk@1, CAST(hd1.hd_demo_sk AS Float64)@2)], projection=[ss_item_sk@0, ss_addr_sk@2, ss_wholesale_cost@3, ss_list_price@4, ss_coupon_amt@5, d_year@6, s_store_name@7, s_zip@8, c_current_hdemo_sk@9, c_current_addr_sk@10, d_year@11, d_year@12, hd_income_band_sk@14] │ CoalescePartitionsExec - │ [Stage 16] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_income_band_sk, CAST(hd_demo_sk@0 AS Float64) as CAST(hd1.hd_demo_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ [Stage 16] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_income_band_sk, CAST(hd_demo_sk@0 AS Float64) as CAST(hd1.hd_demo_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 16 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 16 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(promotion.p_promo_sk AS Float64)@1, ss_promo_sk@3)], projection=[ss_item_sk@2, ss_hdemo_sk@3, ss_addr_sk@4, ss_wholesale_cost@6, ss_list_price@7, ss_coupon_amt@8, d_year@9, s_store_name@10, s_zip@11, c_current_hdemo_sk@12, c_current_addr_sk@13, d_year@14, d_year@15] │ CoalescePartitionsExec │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(c_current_cdemo_sk@10, CAST(cd2.cd_demo_sk AS Float64)@2)], filter=cd_marital_status@1 != cd_marital_status@0, projection=[ss_item_sk@0, ss_hdemo_sk@1, ss_addr_sk@2, ss_promo_sk@3, ss_wholesale_cost@4, ss_list_price@5, ss_coupon_amt@6, d_year@7, s_store_name@8, s_zip@9, c_current_hdemo_sk@11, c_current_addr_sk@12, d_year@13, d_year@14] │ CoalescePartitionsExec - │ [Stage 15] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], ...]}, projection=[cd_demo_sk, cd_marital_status, CAST(cd_demo_sk@0 AS Float64) as CAST(cd2.cd_demo_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] + │ [Stage 15] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-3.parquet:..]]}, projection=[cd_demo_sk, cd_marital_status, CAST(cd_demo_sk@0 AS Float64) as CAST(cd2.cd_demo_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/promotion/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-3.parquet]]}, projection=[p_promo_sk, CAST(p_promo_sk@0 AS Float64) as CAST(promotion.p_promo_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/promotion/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/promotion/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/promotion/part-3.parquet]]}, projection=[p_promo_sk, CAST(p_promo_sk@0 AS Float64) as CAST(promotion.p_promo_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── - ┌───── Stage 15 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 15 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ss_cdemo_sk@1, CAST(cd1.cd_demo_sk AS Float64)@2)], projection=[ss_item_sk@0, ss_hdemo_sk@2, ss_addr_sk@3, ss_promo_sk@4, ss_wholesale_cost@5, ss_list_price@6, ss_coupon_amt@7, d_year@8, s_store_name@9, s_zip@10, c_current_cdemo_sk@11, c_current_hdemo_sk@12, c_current_addr_sk@13, d_year@14, d_year@15, cd_marital_status@17] │ CoalescePartitionsExec │ [Stage 14] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], ...]}, projection=[cd_demo_sk, cd_marital_status, CAST(cd_demo_sk@0 AS Float64) as CAST(cd1.cd_demo_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-3.parquet:..]]}, projection=[cd_demo_sk, cd_marital_status, CAST(cd_demo_sk@0 AS Float64) as CAST(cd1.cd_demo_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 14 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -6928,8 +6397,7 @@ mod tests { │ [Stage 13] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, CAST(d_date_sk@0 AS Float64) as CAST(d3.d_date_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 13 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -6938,8 +6406,7 @@ mod tests { │ [Stage 12] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, CAST(d_date_sk@0 AS Float64) as CAST(d2.d_date_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 12 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -6948,8 +6415,7 @@ mod tests { │ [Stage 11] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_current_cdemo_sk@1 as c_current_cdemo_sk, c_current_hdemo_sk@2 as c_current_hdemo_sk, c_current_addr_sk@3 as c_current_addr_sk, c_first_shipto_date_sk@4 as c_first_shipto_date_sk, c_first_sales_date_sk@5 as c_first_sales_date_sk, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_cdemo_sk, c_current_hdemo_sk, c_current_addr_sk, c_first_shipto_date_sk, c_first_sales_date_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_cdemo_sk, c_current_hdemo_sk, c_current_addr_sk, c_first_shipto_date_sk, c_first_sales_date_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 11 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -6970,16 +6436,14 @@ mod tests { └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name, s_zip, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name, s_zip, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, CAST(d_date_sk@0 AS Float64) as CAST(d1.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 1999 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 1999, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1999 AND 1999 <= d_year_max@1, required_guarantees=[d_year in (1999)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 1999, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1999 AND 1999 <= d_year_max@1, required_guarantees=[d_year in (1999)] └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -6996,24 +6460,20 @@ mod tests { │ [Stage 6] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([cr_item_sk@0, cr_order_number@1], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], ...]}, projection=[cr_item_sk, cr_order_number, cr_refunded_cash, cr_reversed_charge, cr_store_credit], file_type=parquet + │ RepartitionExec: partitioning=Hash([cr_item_sk@0, cr_order_number@1], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-3.parquet:..]]}, projection=[cr_item_sk, cr_order_number, cr_refunded_cash, cr_reversed_charge, cr_store_credit], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([cs_item_sk@0, cs_order_number@1], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_item_sk, cs_order_number, cs_ext_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([cs_item_sk@0, cs_order_number@1], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_item_sk, cs_order_number, cs_ext_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([sr_item_sk@0, sr_ticket_number@1], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], ...]}, projection=[sr_item_sk, sr_ticket_number], file_type=parquet + │ RepartitionExec: partitioning=Hash([sr_item_sk@0, sr_ticket_number@1], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-3.parquet:..]]}, projection=[sr_item_sk, sr_ticket_number], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 10 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([ss_item_sk@1, ss_ticket_number@8], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_customer_sk, ss_cdemo_sk, ss_hdemo_sk, ss_addr_sk, ss_store_sk, ss_promo_sk, ss_ticket_number, ss_wholesale_cost, ss_list_price, ss_coupon_amt], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([ss_item_sk@1, ss_ticket_number@8], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_customer_sk, ss_cdemo_sk, ss_hdemo_sk, ss_addr_sk, ss_store_sk, ss_promo_sk, ss_ticket_number, ss_wholesale_cost, ss_list_price, ss_coupon_amt], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 45 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] │ RepartitionExec: partitioning=Hash([i_product_name@0, i_item_sk@1, s_store_name@2, s_zip@3, ca_street_number@4, ca_street_name@5, ca_city@6, ca_zip@7, ca_street_number@8, ca_street_name@9, ca_city@10, ca_zip@11, d_year@12, d_year@13, d_year@14], 6), input_partitions=3 @@ -7024,16 +6484,14 @@ mod tests { │ [Stage 44] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 │ FilterExec: i_color@2 IN (SET) ([purple, burlywood, indian, spring, floral, medium]) AND i_current_price@1 >= Some(6500),4,2 AND i_current_price@1 <= Some(7400),4,2, projection=[i_item_sk@0, i_product_name@3] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_current_price, i_color, i_product_name], file_type=parquet, predicate=i_color@17 IN (SET) ([purple, burlywood, indian, spring, floral, medium]) AND i_current_price@5 >= Some(6500),4,2 AND i_current_price@5 <= Some(7400),4,2 AND DynamicFilter [ empty ], pruning_predicate=(i_color_null_count@2 != row_count@3 AND i_color_min@0 <= purple AND purple <= i_color_max@1 OR i_color_null_count@2 != row_count@3 AND i_color_min@0 <= burlywood AND burlywood <= i_color_max@1 OR i_color_null_count@2 != row_count@3 AND i_color_min@0 <= indian AND indian <= i_color_max@1 OR i_color_null_count@2 != row_count@3 AND i_color_min@0 <= spring AND spring <= i_color_max@1 OR i_color_null_count@2 != row_count@3 AND i_color_min@0 <= floral AND floral <= i_color_max@1 OR i_color_null_count@2 != row_count@3 AND i_color_min@0 <= medium AND medium <= i_color_max@1) AND i_current_price_null_count@5 != row_count@3 AND i_current_price_max@4 >= Some(6500),4,2 AND i_current_price_null_count@5 != row_count@3 AND i_current_price_min@6 <= Some(7400),4,2, required_guarantees=[i_color in (burlywood, floral, indian, medium, purple, spring)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_current_price, i_color, i_product_name], file_type=parquet, predicate=i_color@17 IN (SET) ([purple, burlywood, indian, spring, floral, medium]) AND i_current_price@5 >= Some(6500),4,2 AND i_current_price@5 <= Some(7400),4,2 AND DynamicFilter [ empty ], pruning_predicate=(i_color_null_count@2 != row_count@3 AND i_color_min@0 <= purple AND purple <= i_color_max@1 OR i_color_null_count@2 != row_count@3 AND i_color_min@0 <= burlywood AND burlywood <= i_color_max@1 OR i_color_null_count@2 != row_count@3 AND i_color_min@0 <= indian AND indian <= i_color_max@1 OR i_color_null_count@2 != row_count@3 AND i_color_min@0 <= spring AND spring <= i_color_max@1 OR i_color_null_count@2 != row_count@3 AND i_color_min@0 <= floral AND floral <= i_color_max@1 OR i_color_null_count@2 != row_count@3 AND i_color_min@0 <= medium AND medium <= i_color_max@1) AND i_current_price_null_count@5 != row_count@3 AND i_current_price_max@4 >= Some(6500),4,2 AND i_current_price_null_count@5 != row_count@3 AND i_current_price_min@6 <= Some(7400),4,2, required_guarantees=[i_color in (burlywood, floral, indian, medium, purple, spring)] └────────────────────────────────────────────────── ┌───── Stage 44 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(hd_income_band_sk@9, ib_income_band_sk@0)], projection=[ss_item_sk@0, ss_wholesale_cost@1, ss_list_price@2, ss_coupon_amt@3, d_year@4, s_store_name@5, s_zip@6, d_year@7, d_year@8, ca_street_number@10, ca_street_name@11, ca_city@12, ca_zip@13, ca_street_number@14, ca_street_name@15, ca_city@16, ca_zip@17] │ CoalescePartitionsExec │ [Stage 43] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/income_band/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/income_band/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/income_band/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/income_band/part-3.parquet]]}, projection=[ib_income_band_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/income_band/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/income_band/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/income_band/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/income_band/part-3.parquet]]}, projection=[ib_income_band_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 43 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -7044,13 +6502,11 @@ mod tests { │ CoalescePartitionsExec │ [Stage 42] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_street_number, ca_street_name, ca_city, ca_zip], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_street_number, ca_street_name, ca_city, ca_zip], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 24 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/income_band/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/income_band/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/income_band/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/income_band/part-3.parquet]]}, projection=[ib_income_band_sk], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/income_band/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/income_band/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/income_band/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/income_band/part-3.parquet]]}, projection=[ib_income_band_sk], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 42 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -7059,48 +6515,42 @@ mod tests { │ [Stage 41] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 │ ProjectionExec: expr=[ca_address_sk@0 as ca_address_sk, ca_street_number@1 as ca_street_number, ca_street_name@2 as ca_street_name, ca_city@3 as ca_city, ca_zip@4 as ca_zip, CAST(ca_address_sk@0 AS Float64) as CAST(ad1.ca_address_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_street_number, ca_street_name, ca_city, ca_zip], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_street_number, ca_street_name, ca_city, ca_zip], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 41 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(c_current_hdemo_sk@8, CAST(hd2.hd_demo_sk AS Float64)@2)], projection=[ss_item_sk@0, ss_addr_sk@1, ss_wholesale_cost@2, ss_list_price@3, ss_coupon_amt@4, d_year@5, s_store_name@6, s_zip@7, c_current_addr_sk@9, d_year@10, d_year@11, hd_income_band_sk@12, hd_income_band_sk@14] │ CoalescePartitionsExec │ [Stage 40] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_income_band_sk, CAST(hd_demo_sk@0 AS Float64) as CAST(hd2.hd_demo_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_income_band_sk, CAST(hd_demo_sk@0 AS Float64) as CAST(hd2.hd_demo_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 40 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ss_hdemo_sk@1, CAST(hd1.hd_demo_sk AS Float64)@2)], projection=[ss_item_sk@0, ss_addr_sk@2, ss_wholesale_cost@3, ss_list_price@4, ss_coupon_amt@5, d_year@6, s_store_name@7, s_zip@8, c_current_hdemo_sk@9, c_current_addr_sk@10, d_year@11, d_year@12, hd_income_band_sk@14] │ CoalescePartitionsExec - │ [Stage 39] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_income_band_sk, CAST(hd_demo_sk@0 AS Float64) as CAST(hd1.hd_demo_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ [Stage 39] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_income_band_sk, CAST(hd_demo_sk@0 AS Float64) as CAST(hd1.hd_demo_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 39 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 39 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(promotion.p_promo_sk AS Float64)@1, ss_promo_sk@3)], projection=[ss_item_sk@2, ss_hdemo_sk@3, ss_addr_sk@4, ss_wholesale_cost@6, ss_list_price@7, ss_coupon_amt@8, d_year@9, s_store_name@10, s_zip@11, c_current_hdemo_sk@12, c_current_addr_sk@13, d_year@14, d_year@15] │ CoalescePartitionsExec │ [Stage 25] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(c_current_cdemo_sk@10, CAST(cd2.cd_demo_sk AS Float64)@2)], filter=cd_marital_status@1 != cd_marital_status@0, projection=[ss_item_sk@0, ss_hdemo_sk@1, ss_addr_sk@2, ss_promo_sk@3, ss_wholesale_cost@4, ss_list_price@5, ss_coupon_amt@6, d_year@7, s_store_name@8, s_zip@9, c_current_hdemo_sk@11, c_current_addr_sk@12, d_year@13, d_year@14] │ CoalescePartitionsExec - │ [Stage 38] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], ...]}, projection=[cd_demo_sk, cd_marital_status, CAST(cd_demo_sk@0 AS Float64) as CAST(cd2.cd_demo_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] + │ [Stage 38] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-3.parquet:..]]}, projection=[cd_demo_sk, cd_marital_status, CAST(cd_demo_sk@0 AS Float64) as CAST(cd2.cd_demo_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 25 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/promotion/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-3.parquet]]}, projection=[p_promo_sk, CAST(p_promo_sk@0 AS Float64) as CAST(promotion.p_promo_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/promotion/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/promotion/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/promotion/part-3.parquet]]}, projection=[p_promo_sk, CAST(p_promo_sk@0 AS Float64) as CAST(promotion.p_promo_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── - ┌───── Stage 38 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 38 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ss_cdemo_sk@1, CAST(cd1.cd_demo_sk AS Float64)@2)], projection=[ss_item_sk@0, ss_hdemo_sk@2, ss_addr_sk@3, ss_promo_sk@4, ss_wholesale_cost@5, ss_list_price@6, ss_coupon_amt@7, d_year@8, s_store_name@9, s_zip@10, c_current_cdemo_sk@11, c_current_hdemo_sk@12, c_current_addr_sk@13, d_year@14, d_year@15, cd_marital_status@17] │ CoalescePartitionsExec │ [Stage 37] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], ...]}, projection=[cd_demo_sk, cd_marital_status, CAST(cd_demo_sk@0 AS Float64) as CAST(cd1.cd_demo_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-3.parquet:..]]}, projection=[cd_demo_sk, cd_marital_status, CAST(cd_demo_sk@0 AS Float64) as CAST(cd1.cd_demo_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 37 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -7109,8 +6559,7 @@ mod tests { │ [Stage 36] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, CAST(d_date_sk@0 AS Float64) as CAST(d3.d_date_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 36 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -7119,8 +6568,7 @@ mod tests { │ [Stage 35] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, CAST(d_date_sk@0 AS Float64) as CAST(d2.d_date_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 35 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -7129,8 +6577,7 @@ mod tests { │ [Stage 34] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_current_cdemo_sk@1 as c_current_cdemo_sk, c_current_hdemo_sk@2 as c_current_hdemo_sk, c_current_addr_sk@3 as c_current_addr_sk, c_first_shipto_date_sk@4 as c_first_shipto_date_sk, c_first_sales_date_sk@5 as c_first_sales_date_sk, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_cdemo_sk, c_current_hdemo_sk, c_current_addr_sk, c_first_shipto_date_sk, c_first_sales_date_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_cdemo_sk, c_current_hdemo_sk, c_current_addr_sk, c_first_shipto_date_sk, c_first_sales_date_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 34 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -7151,16 +6598,14 @@ mod tests { └────────────────────────────────────────────────── ┌───── Stage 26 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name, s_zip, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name, s_zip, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 27 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, CAST(d_date_sk@0 AS Float64) as CAST(d1.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2000 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2000, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1, required_guarantees=[d_year in (2000)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2000, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1, required_guarantees=[d_year in (2000)] └────────────────────────────────────────────────── ┌───── Stage 31 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -7177,24 +6622,20 @@ mod tests { │ [Stage 29] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 28 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([cr_item_sk@0, cr_order_number@1], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], ...]}, projection=[cr_item_sk, cr_order_number, cr_refunded_cash, cr_reversed_charge, cr_store_credit], file_type=parquet + │ RepartitionExec: partitioning=Hash([cr_item_sk@0, cr_order_number@1], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-3.parquet:..]]}, projection=[cr_item_sk, cr_order_number, cr_refunded_cash, cr_reversed_charge, cr_store_credit], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 29 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([cs_item_sk@0, cs_order_number@1], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_item_sk, cs_order_number, cs_ext_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([cs_item_sk@0, cs_order_number@1], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_item_sk, cs_order_number, cs_ext_list_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 32 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([sr_item_sk@0, sr_ticket_number@1], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], ...]}, projection=[sr_item_sk, sr_ticket_number], file_type=parquet + │ RepartitionExec: partitioning=Hash([sr_item_sk@0, sr_ticket_number@1], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-3.parquet:..]]}, projection=[sr_item_sk, sr_ticket_number], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 33 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([ss_item_sk@1, ss_ticket_number@8], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_customer_sk, ss_cdemo_sk, ss_hdemo_sk, ss_addr_sk, ss_store_sk, ss_promo_sk, ss_ticket_number, ss_wholesale_cost, ss_list_price, ss_coupon_amt], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([ss_item_sk@1, ss_ticket_number@8], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_customer_sk, ss_cdemo_sk, ss_hdemo_sk, ss_addr_sk, ss_store_sk, ss_promo_sk, ss_ticket_number, ss_wholesale_cost, ss_list_price, ss_coupon_amt], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── "); Ok(()) @@ -7202,7 +6643,7 @@ mod tests { #[tokio::test] async fn test_tpcds_65() -> Result<()> { let display = test_tpcds_query("q65").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [s_store_name@0 ASC, i_item_desc@1 ASC], fetch=100 │ [Stage 9] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -7223,8 +6664,7 @@ mod tests { │ CoalescePartitionsExec │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=6, input_tasks=2 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_desc, i_current_price, i_wholesale_cost, i_brand], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_desc, i_current_price, i_wholesale_cost, i_brand], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=3, consumer_tasks=2, output_partitions=6 @@ -7237,25 +6677,22 @@ mod tests { └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p3] t1:[p4..p7] │ BroadcastExec: input_partitions=2, consumer_tasks=2, output_partitions=4 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([ss_store_sk@0, ss_item_sk@1], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([ss_store_sk@0, ss_item_sk@1], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[ss_store_sk@1 as ss_store_sk, ss_item_sk@0 as ss_item_sk], aggr=[sum(store_sales.ss_sales_price)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ss_sold_date_sk@0)], projection=[ss_item_sk@3, ss_store_sk@4, ss_sales_price@5] │ CoalescePartitionsExec │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_store_sk, ss_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_store_sk, ss_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_month_seq@1 >= 1176 AND d_month_seq@1 <= 1187, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_month_seq], file_type=parquet, predicate=d_month_seq@3 >= 1176 AND d_month_seq@3 <= 1187, pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1176 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1187, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_month_seq], file_type=parquet, predicate=d_month_seq@3 >= 1176 AND d_month_seq@3 <= 1187, pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1176 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1187, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p5] t1:[p0..p5] │ RepartitionExec: partitioning=Hash([ss_store_sk@0], 6), input_partitions=3 @@ -7265,21 +6702,19 @@ mod tests { │ [Stage 7] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([ss_store_sk@0, ss_item_sk@1], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([ss_store_sk@0, ss_item_sk@1], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[ss_store_sk@1 as ss_store_sk, ss_item_sk@0 as ss_item_sk], aggr=[sum(store_sales.ss_sales_price)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ss_sold_date_sk@0)], projection=[ss_item_sk@3, ss_store_sk@4, ss_sales_price@5] │ CoalescePartitionsExec │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_store_sk, ss_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_store_sk, ss_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_month_seq@1 >= 1176 AND d_month_seq@1 <= 1187, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_month_seq], file_type=parquet, predicate=d_month_seq@3 >= 1176 AND d_month_seq@3 <= 1187, pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1176 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1187, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_month_seq], file_type=parquet, predicate=d_month_seq@3 >= 1176 AND d_month_seq@3 <= 1187, pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1176 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1187, required_guarantees=[] └────────────────────────────────────────────────── "); Ok(()) @@ -7287,7 +6722,7 @@ mod tests { #[tokio::test] async fn test_tpcds_66() -> Result<()> { let display = test_tpcds_query("q66").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [w_warehouse_name@0 ASC], fetch=100 │ SortExec: TopK(fetch=100), expr=[w_warehouse_name@0 ASC], preserve_partitioning=[true] @@ -7320,41 +6755,36 @@ mod tests { │ ProjectionExec: expr=[t_time_sk@0 as t_time_sk, CAST(t_time_sk@0 AS Float64) as CAST(time_dim.t_time_sk AS Float64)] │ FilterExec: t_time@1 >= 30838 AND t_time@1 <= 59638, projection=[t_time_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/time_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-3.parquet]]}, projection=[t_time_sk, t_time], file_type=parquet, predicate=t_time@2 >= 30838 AND t_time@2 <= 59638 AND DynamicFilter [ empty ], pruning_predicate=t_time_null_count@1 != row_count@2 AND t_time_max@0 >= 30838 AND t_time_null_count@1 != row_count@2 AND t_time_min@3 <= 59638, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/time_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/time_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/time_dim/part-3.parquet]]}, projection=[t_time_sk, t_time], file_type=parquet, predicate=t_time@2 >= 30838 AND t_time@2 <= 59638 AND DynamicFilter [ empty ], pruning_predicate=t_time_null_count@1 != row_count@2 AND t_time_max@0 >= 30838 AND t_time_null_count@1 != row_count@2 AND t_time_min@3 <= 59638, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[sm_ship_mode_sk@0 as sm_ship_mode_sk, CAST(sm_ship_mode_sk@0 AS Float64) as CAST(ship_mode.sm_ship_mode_sk AS Float64)] │ FilterExec: sm_carrier@1 = DHL OR sm_carrier@1 = BARIAN, projection=[sm_ship_mode_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/ship_mode/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/ship_mode/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/ship_mode/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/ship_mode/part-3.parquet]]}, projection=[sm_ship_mode_sk, sm_carrier], file_type=parquet, predicate=sm_carrier@4 = DHL OR sm_carrier@4 = BARIAN, pruning_predicate=sm_carrier_null_count@2 != row_count@3 AND sm_carrier_min@0 <= DHL AND DHL <= sm_carrier_max@1 OR sm_carrier_null_count@2 != row_count@3 AND sm_carrier_min@0 <= BARIAN AND BARIAN <= sm_carrier_max@1, required_guarantees=[sm_carrier in (BARIAN, DHL)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/ship_mode/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/ship_mode/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/ship_mode/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/ship_mode/part-3.parquet]]}, projection=[sm_ship_mode_sk, sm_carrier], file_type=parquet, predicate=sm_carrier@4 = DHL OR sm_carrier@4 = BARIAN, pruning_predicate=sm_carrier_null_count@2 != row_count@3 AND sm_carrier_min@0 <= DHL AND DHL <= sm_carrier_max@1 OR sm_carrier_null_count@2 != row_count@3 AND sm_carrier_min@0 <= BARIAN AND BARIAN <= sm_carrier_max@1, required_guarantees=[sm_carrier in (BARIAN, DHL)] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ws_sold_date_sk@0, CAST(date_dim.d_date_sk AS Float64)@3)], projection=[ws_sold_time_sk@1, ws_ship_mode_sk@2, ws_quantity@3, ws_ext_sales_price@4, ws_net_paid@5, w_warehouse_name@6, w_warehouse_sq_ft@7, w_city@8, w_county@9, w_state@10, w_country@11, d_year@13, d_moy@14] │ CoalescePartitionsExec - │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, d_moy@2 as d_moy, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2001 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2001 AND DynamicFilter [ empty ], pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2001 AND DynamicFilter [ empty ], pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (2001)] └────────────────────────────────────────────────── - ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[ws_sold_date_sk@6 as ws_sold_date_sk, ws_sold_time_sk@7 as ws_sold_time_sk, ws_ship_mode_sk@8 as ws_ship_mode_sk, ws_quantity@9 as ws_quantity, ws_ext_sales_price@10 as ws_ext_sales_price, ws_net_paid@11 as ws_net_paid, w_warehouse_name@0 as w_warehouse_name, w_warehouse_sq_ft@1 as w_warehouse_sq_ft, w_city@2 as w_city, w_county@3 as w_county, w_state@4 as w_state, w_country@5 as w_country] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(warehouse.w_warehouse_sk AS Float64)@7, ws_warehouse_sk@3)], projection=[w_warehouse_name@1, w_warehouse_sq_ft@2, w_city@3, w_county@4, w_state@5, w_country@6, ws_sold_date_sk@8, ws_sold_time_sk@9, ws_ship_mode_sk@10, ws_quantity@12, ws_ext_sales_price@13, ws_net_paid@14] │ CoalescePartitionsExec │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_sold_time_sk, ws_ship_mode_sk, ws_warehouse_sk, ws_quantity, ws_ext_sales_price, ws_net_paid], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_sold_time_sk, ws_ship_mode_sk, ws_warehouse_sk, ws_quantity, ws_ext_sales_price, ws_net_paid], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/warehouse/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-3.parquet]]}, projection=[w_warehouse_sk, w_warehouse_name, w_warehouse_sq_ft, w_city, w_county, w_state, w_country, CAST(w_warehouse_sk@0 AS Float64) as CAST(warehouse.w_warehouse_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/warehouse/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/warehouse/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/warehouse/part-3.parquet]]}, projection=[w_warehouse_sk, w_warehouse_name, w_warehouse_sq_ft, w_city, w_county, w_state, w_country, CAST(w_warehouse_sk@0 AS Float64) as CAST(warehouse.w_warehouse_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 10 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] │ RepartitionExec: partitioning=Hash([w_warehouse_name@0, w_warehouse_sq_ft@1, w_city@2, w_county@3, w_state@4, w_country@5, d_year@6], 6), input_partitions=3 @@ -7369,41 +6799,36 @@ mod tests { │ ProjectionExec: expr=[t_time_sk@0 as t_time_sk, CAST(t_time_sk@0 AS Float64) as CAST(time_dim.t_time_sk AS Float64)] │ FilterExec: t_time@1 >= 30838 AND t_time@1 <= 59638, projection=[t_time_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/time_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-3.parquet]]}, projection=[t_time_sk, t_time], file_type=parquet, predicate=t_time@2 >= 30838 AND t_time@2 <= 59638 AND DynamicFilter [ empty ], pruning_predicate=t_time_null_count@1 != row_count@2 AND t_time_max@0 >= 30838 AND t_time_null_count@1 != row_count@2 AND t_time_min@3 <= 59638, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/time_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/time_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/time_dim/part-3.parquet]]}, projection=[t_time_sk, t_time], file_type=parquet, predicate=t_time@2 >= 30838 AND t_time@2 <= 59638 AND DynamicFilter [ empty ], pruning_predicate=t_time_null_count@1 != row_count@2 AND t_time_max@0 >= 30838 AND t_time_null_count@1 != row_count@2 AND t_time_min@3 <= 59638, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[sm_ship_mode_sk@0 as sm_ship_mode_sk, CAST(sm_ship_mode_sk@0 AS Float64) as CAST(ship_mode.sm_ship_mode_sk AS Float64)] │ FilterExec: sm_carrier@1 = DHL OR sm_carrier@1 = BARIAN, projection=[sm_ship_mode_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/ship_mode/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/ship_mode/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/ship_mode/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/ship_mode/part-3.parquet]]}, projection=[sm_ship_mode_sk, sm_carrier], file_type=parquet, predicate=sm_carrier@4 = DHL OR sm_carrier@4 = BARIAN, pruning_predicate=sm_carrier_null_count@2 != row_count@3 AND sm_carrier_min@0 <= DHL AND DHL <= sm_carrier_max@1 OR sm_carrier_null_count@2 != row_count@3 AND sm_carrier_min@0 <= BARIAN AND BARIAN <= sm_carrier_max@1, required_guarantees=[sm_carrier in (BARIAN, DHL)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/ship_mode/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/ship_mode/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/ship_mode/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/ship_mode/part-3.parquet]]}, projection=[sm_ship_mode_sk, sm_carrier], file_type=parquet, predicate=sm_carrier@4 = DHL OR sm_carrier@4 = BARIAN, pruning_predicate=sm_carrier_null_count@2 != row_count@3 AND sm_carrier_min@0 <= DHL AND DHL <= sm_carrier_max@1 OR sm_carrier_null_count@2 != row_count@3 AND sm_carrier_min@0 <= BARIAN AND BARIAN <= sm_carrier_max@1, required_guarantees=[sm_carrier in (BARIAN, DHL)] └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(cs_sold_date_sk@0, CAST(date_dim.d_date_sk AS Float64)@3)], projection=[cs_sold_time_sk@1, cs_ship_mode_sk@2, cs_quantity@3, cs_sales_price@4, cs_net_paid_inc_tax@5, w_warehouse_name@6, w_warehouse_sq_ft@7, w_city@8, w_county@9, w_state@10, w_country@11, d_year@13, d_moy@14] │ CoalescePartitionsExec - │ [Stage 8] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 8] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, d_moy@2 as d_moy, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2001 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2001 AND DynamicFilter [ empty ], pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2001 AND DynamicFilter [ empty ], pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (2001)] └────────────────────────────────────────────────── - ┌───── Stage 8 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 8 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[cs_sold_date_sk@6 as cs_sold_date_sk, cs_sold_time_sk@7 as cs_sold_time_sk, cs_ship_mode_sk@8 as cs_ship_mode_sk, cs_quantity@9 as cs_quantity, cs_sales_price@10 as cs_sales_price, cs_net_paid_inc_tax@11 as cs_net_paid_inc_tax, w_warehouse_name@0 as w_warehouse_name, w_warehouse_sq_ft@1 as w_warehouse_sq_ft, w_city@2 as w_city, w_county@3 as w_county, w_state@4 as w_state, w_country@5 as w_country] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(warehouse.w_warehouse_sk AS Float64)@7, cs_warehouse_sk@3)], projection=[w_warehouse_name@1, w_warehouse_sq_ft@2, w_city@3, w_county@4, w_state@5, w_country@6, cs_sold_date_sk@8, cs_sold_time_sk@9, cs_ship_mode_sk@10, cs_quantity@12, cs_sales_price@13, cs_net_paid_inc_tax@14] │ CoalescePartitionsExec │ [Stage 7] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_sold_time_sk, cs_ship_mode_sk, cs_warehouse_sk, cs_quantity, cs_sales_price, cs_net_paid_inc_tax], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_sold_time_sk, cs_ship_mode_sk, cs_warehouse_sk, cs_quantity, cs_sales_price, cs_net_paid_inc_tax], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/warehouse/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-3.parquet]]}, projection=[w_warehouse_sk, w_warehouse_name, w_warehouse_sq_ft, w_city, w_county, w_state, w_country, CAST(w_warehouse_sk@0 AS Float64) as CAST(warehouse.w_warehouse_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/warehouse/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/warehouse/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/warehouse/part-3.parquet]]}, projection=[w_warehouse_sk, w_warehouse_name, w_warehouse_sq_ft, w_city, w_county, w_state, w_country, CAST(w_warehouse_sk@0 AS Float64) as CAST(warehouse.w_warehouse_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── "); Ok(()) @@ -7433,13 +6858,12 @@ mod tests { │ ProjectionExec: expr=[CAST(ss_sales_price@1 AS Float64) * ss_quantity@0 as __common_expr_1, d_year@2 as d_year, d_moy@3 as d_moy, d_qoy@4 as d_qoy, s_store_id@5 as s_store_id, i_brand@6 as i_brand, i_class@7 as i_class, i_category@8 as i_category, i_product_name@9 as i_product_name] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ss_item_sk@0, i_item_sk@0)], projection=[ss_quantity@1, ss_sales_price@2, d_year@3, d_moy@4, d_qoy@5, s_store_id@6, i_brand@8, i_class@9, i_category@10, i_product_name@11] │ CoalescePartitionsExec - │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand, i_class, i_category, i_product_name], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand, i_class, i_category, i_product_name], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[ss_item_sk@1 as ss_item_sk, ss_quantity@2 as ss_quantity, ss_sales_price@3 as ss_sales_price, d_year@4 as d_year, d_moy@5 as d_moy, d_qoy@6 as d_qoy, s_store_id@0 as s_store_id] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(store.s_store_sk AS Float64)@2, ss_store_sk@1)], projection=[s_store_id@1, ss_item_sk@3, ss_quantity@5, ss_sales_price@6, d_year@7, d_moy@8, d_qoy@9] │ CoalescePartitionsExec @@ -7448,21 +6872,18 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@4, ss_sold_date_sk@0)], projection=[d_year@1, d_moy@2, d_qoy@3, ss_item_sk@6, ss_store_sk@7, ss_quantity@8, ss_sales_price@9] │ CoalescePartitionsExec │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_store_sk, ss_quantity, ss_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_store_sk, ss_quantity, ss_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_id, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_id, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, d_moy@2 as d_moy, d_qoy@3 as d_qoy, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_month_seq@1 >= 1200 AND d_month_seq@1 <= 1211, projection=[d_date_sk@0, d_year@2, d_moy@3, d_qoy@4] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_month_seq, d_year, d_moy, d_qoy], file_type=parquet, predicate=d_month_seq@3 >= 1200 AND d_month_seq@3 <= 1211, pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1200 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1211, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_month_seq, d_year, d_moy, d_qoy], file_type=parquet, predicate=d_month_seq@3 >= 1200 AND d_month_seq@3 <= 1211, pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1200 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1211, required_guarantees=[] └────────────────────────────────────────────────── "#); Ok(()) @@ -7470,7 +6891,7 @@ mod tests { #[tokio::test] async fn test_tpcds_68() -> Result<()> { let display = test_tpcds_query("q68").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [c_last_name@0 ASC, ss_ticket_number@4 ASC], fetch=100 │ [Stage 8] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -7482,8 +6903,7 @@ mod tests { │ CoalescePartitionsExec │ [Stage 7] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=6, input_tasks=2 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_city], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_city], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=3, consumer_tasks=2, output_partitions=6 @@ -7492,8 +6912,7 @@ mod tests { │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=6, input_tasks=2 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_current_addr_sk@1 as c_current_addr_sk, c_first_name@2 as c_first_name, c_last_name@3 as c_last_name, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_addr_sk, c_first_name, c_last_name], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_addr_sk, c_first_name, c_last_name], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=3, consumer_tasks=2, output_partitions=6 @@ -7506,14 +6925,13 @@ mod tests { │ AggregateExec: mode=Partial, gby=[ss_ticket_number@2 as ss_ticket_number, ss_customer_sk@0 as ss_customer_sk, ss_addr_sk@1 as ss_addr_sk, ca_city@6 as ca_city], aggr=[sum(store_sales.ss_ext_sales_price), sum(store_sales.ss_ext_list_price), sum(store_sales.ss_ext_tax)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ss_addr_sk@1, CAST(customer_address.ca_address_sk AS Float64)@2)], projection=[ss_customer_sk@0, ss_addr_sk@1, ss_ticket_number@2, ss_ext_sales_price@3, ss_ext_list_price@4, ss_ext_tax@5, ca_city@7] │ CoalescePartitionsExec - │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[ca_address_sk@0 as ca_address_sk, ca_city@1 as ca_city, CAST(ca_address_sk@0 AS Float64) as CAST(customer_address.ca_address_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_city], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_city], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 4 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 4 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(household_demographics.hd_demo_sk AS Float64)@1, ss_hdemo_sk@1)], projection=[ss_customer_sk@2, ss_addr_sk@4, ss_ticket_number@5, ss_ext_sales_price@6, ss_ext_list_price@7, ss_ext_tax@8] │ CoalescePartitionsExec │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 @@ -7523,32 +6941,28 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ss_sold_date_sk@0)], projection=[ss_customer_sk@3, ss_hdemo_sk@4, ss_addr_sk@5, ss_store_sk@6, ss_ticket_number@7, ss_ext_sales_price@8, ss_ext_list_price@9, ss_ext_tax@10] │ CoalescePartitionsExec │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_customer_sk, ss_hdemo_sk, ss_addr_sk, ss_store_sk, ss_ticket_number, ss_ext_sales_price, ss_ext_list_price, ss_ext_tax], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_customer_sk, ss_hdemo_sk, ss_addr_sk, ss_store_sk, ss_ticket_number, ss_ext_sales_price, ss_ext_list_price, ss_ext_tax], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[hd_demo_sk@0 as hd_demo_sk, CAST(hd_demo_sk@0 AS Float64) as CAST(household_demographics.hd_demo_sk AS Float64)] │ FilterExec: hd_dep_count@1 = 4 OR hd_vehicle_count@2 = 3, projection=[hd_demo_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_dep_count, hd_vehicle_count], file_type=parquet, predicate=hd_dep_count@3 = 4 OR hd_vehicle_count@4 = 3, pruning_predicate=hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 4 AND 4 <= hd_dep_count_max@1 OR hd_vehicle_count_null_count@6 != row_count@3 AND hd_vehicle_count_min@4 <= 3 AND 3 <= hd_vehicle_count_max@5, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_dep_count, hd_vehicle_count], file_type=parquet, predicate=hd_dep_count@3 = 4 OR hd_vehicle_count@4 = 3, pruning_predicate=hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 4 AND 4 <= hd_dep_count_max@1 OR hd_vehicle_count_null_count@6 != row_count@3 AND hd_vehicle_count_min@4 <= 3 AND 3 <= hd_vehicle_count_max@5, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[s_store_sk@0 as s_store_sk, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)] │ FilterExec: s_city@1 = Fairview OR s_city@1 = Midway, projection=[s_store_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_city], file_type=parquet, predicate=s_city@22 = Fairview OR s_city@22 = Midway, pruning_predicate=s_city_null_count@2 != row_count@3 AND s_city_min@0 <= Fairview AND Fairview <= s_city_max@1 OR s_city_null_count@2 != row_count@3 AND s_city_min@0 <= Midway AND Midway <= s_city_max@1, required_guarantees=[s_city in (Fairview, Midway)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_city], file_type=parquet, predicate=s_city@22 = Fairview OR s_city@22 = Midway, pruning_predicate=s_city_null_count@2 != row_count@3 AND s_city_min@0 <= Fairview AND Fairview <= s_city_max@1 OR s_city_null_count@2 != row_count@3 AND s_city_min@0 <= Midway AND Midway <= s_city_max@1, required_guarantees=[s_city in (Fairview, Midway)] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_dom@2 >= 1 AND d_dom@2 <= 2 AND (d_year@1 = 1999 OR d_year@1 = 2000 OR d_year@1 = 2001), projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_dom], file_type=parquet, predicate=d_dom@9 >= 1 AND d_dom@9 <= 2 AND (d_year@6 = 1999 OR d_year@6 = 2000 OR d_year@6 = 2001), pruning_predicate=d_dom_null_count@1 != row_count@2 AND d_dom_max@0 >= 1 AND d_dom_null_count@1 != row_count@2 AND d_dom_min@3 <= 2 AND (d_year_null_count@6 != row_count@2 AND d_year_min@4 <= 1999 AND 1999 <= d_year_max@5 OR d_year_null_count@6 != row_count@2 AND d_year_min@4 <= 2000 AND 2000 <= d_year_max@5 OR d_year_null_count@6 != row_count@2 AND d_year_min@4 <= 2001 AND 2001 <= d_year_max@5), required_guarantees=[d_year in (1999, 2000, 2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_dom], file_type=parquet, predicate=d_dom@9 >= 1 AND d_dom@9 <= 2 AND (d_year@6 = 1999 OR d_year@6 = 2000 OR d_year@6 = 2001), pruning_predicate=d_dom_null_count@1 != row_count@2 AND d_dom_max@0 >= 1 AND d_dom_null_count@1 != row_count@2 AND d_dom_min@3 <= 2 AND (d_year_null_count@6 != row_count@2 AND d_year_min@4 <= 1999 AND 1999 <= d_year_max@5 OR d_year_null_count@6 != row_count@2 AND d_year_min@4 <= 2000 AND 2000 <= d_year_max@5 OR d_year_null_count@6 != row_count@2 AND d_year_min@4 <= 2001 AND 2001 <= d_year_max@5), required_guarantees=[d_year in (1999, 2000, 2001)] └────────────────────────────────────────────────── "); Ok(()) @@ -7556,7 +6970,7 @@ mod tests { #[tokio::test] async fn test_tpcds_69() -> Result<()> { let display = test_tpcds_query("q69").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [cd_gender@0 ASC NULLS LAST, cd_marital_status@1 ASC NULLS LAST, cd_education_status@2 ASC NULLS LAST, cd_purchase_estimate@4 ASC NULLS LAST, cd_credit_rating@6 ASC NULLS LAST], fetch=100 │ [Stage 10] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -7568,73 +6982,66 @@ mod tests { │ [Stage 9] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([cd_gender@0, cd_marital_status@1, cd_education_status@2, cd_purchase_estimate@3, cd_credit_rating@4], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([cd_gender@0, cd_marital_status@1, cd_education_status@2, cd_purchase_estimate@3, cd_credit_rating@4], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[cd_gender@0 as cd_gender, cd_marital_status@1 as cd_marital_status, cd_education_status@2 as cd_education_status, cd_purchase_estimate@3 as cd_purchase_estimate, cd_credit_rating@4 as cd_credit_rating], aggr=[count(Int64(1))] │ HashJoinExec: mode=CollectLeft, join_type=RightAnti, on=[(cs_ship_customer_sk@0, CAST(c.c_customer_sk AS Float64)@6)], projection=[cd_gender@1, cd_marital_status@2, cd_education_status@3, cd_purchase_estimate@4, cd_credit_rating@5] │ CoalescePartitionsExec - │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, cd_gender@1 as cd_gender, cd_marital_status@2 as cd_marital_status, cd_education_status@3 as cd_education_status, cd_purchase_estimate@4 as cd_purchase_estimate, cd_credit_rating@5 as cd_credit_rating, CAST(c_customer_sk@0 AS Float64) as CAST(c.c_customer_sk AS Float64)] │ HashJoinExec: mode=CollectLeft, join_type=RightAnti, on=[(ws_bill_customer_sk@0, CAST(c.c_customer_sk AS Float64)@6)], projection=[c_customer_sk@0, cd_gender@1, cd_marital_status@2, cd_education_status@3, cd_purchase_estimate@4, cd_credit_rating@5] │ CoalescePartitionsExec - │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, cd_gender@1 as cd_gender, cd_marital_status@2 as cd_marital_status, cd_education_status@3 as cd_education_status, cd_purchase_estimate@4 as cd_purchase_estimate, cd_credit_rating@5 as cd_credit_rating, CAST(c_customer_sk@0 AS Float64) as CAST(c.c_customer_sk AS Float64)] │ HashJoinExec: mode=CollectLeft, join_type=RightSemi, on=[(ss_customer_sk@0, CAST(c.c_customer_sk AS Float64)@6)], projection=[c_customer_sk@0, cd_gender@1, cd_marital_status@2, cd_education_status@3, cd_purchase_estimate@4, cd_credit_rating@5] │ CoalescePartitionsExec - │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, cd_gender@1 as cd_gender, cd_marital_status@2 as cd_marital_status, cd_education_status@3 as cd_education_status, cd_purchase_estimate@4 as cd_purchase_estimate, cd_credit_rating@5 as cd_credit_rating, CAST(c_customer_sk@0 AS Float64) as CAST(c.c_customer_sk AS Float64)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(c_current_cdemo_sk@1, CAST(customer_demographics.cd_demo_sk AS Float64)@6)], projection=[c_customer_sk@0, cd_gender@3, cd_marital_status@4, cd_education_status@5, cd_purchase_estimate@6, cd_credit_rating@7] │ CoalescePartitionsExec │ [Stage 8] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], ...]}, projection=[cd_demo_sk, cd_gender, cd_marital_status, cd_education_status, cd_purchase_estimate, cd_credit_rating, CAST(cd_demo_sk@0 AS Float64) as CAST(customer_demographics.cd_demo_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-3.parquet:..]]}, projection=[cd_demo_sk, cd_gender, cd_marital_status, cd_education_status, cd_purchase_estimate, cd_credit_rating, CAST(cd_demo_sk@0 AS Float64) as CAST(customer_demographics.cd_demo_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, cs_sold_date_sk@0)], projection=[cs_ship_customer_sk@3] │ CoalescePartitionsExec │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_ship_customer_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_ship_customer_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2001 AND d_moy@2 >= 4 AND d_moy@2 <= 6, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2001 AND d_moy@8 >= 4 AND d_moy@8 <= 6, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 AND d_moy_null_count@5 != row_count@3 AND d_moy_max@4 >= 4 AND d_moy_null_count@5 != row_count@3 AND d_moy_min@6 <= 6, required_guarantees=[d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2001 AND d_moy@8 >= 4 AND d_moy@8 <= 6, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 AND d_moy_null_count@5 != row_count@3 AND d_moy_max@4 >= 4 AND d_moy_null_count@5 != row_count@3 AND d_moy_min@6 <= 6, required_guarantees=[d_year in (2001)] └────────────────────────────────────────────────── - ┌───── Stage 4 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 4 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ws_sold_date_sk@0)], projection=[ws_bill_customer_sk@3] │ CoalescePartitionsExec │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_bill_customer_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_bill_customer_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2001 AND d_moy@2 >= 4 AND d_moy@2 <= 6, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2001 AND d_moy@8 >= 4 AND d_moy@8 <= 6, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 AND d_moy_null_count@5 != row_count@3 AND d_moy_max@4 >= 4 AND d_moy_null_count@5 != row_count@3 AND d_moy_min@6 <= 6, required_guarantees=[d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2001 AND d_moy@8 >= 4 AND d_moy@8 <= 6, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 AND d_moy_null_count@5 != row_count@3 AND d_moy_max@4 >= 4 AND d_moy_null_count@5 != row_count@3 AND d_moy_min@6 <= 6, required_guarantees=[d_year in (2001)] └────────────────────────────────────────────────── - ┌───── Stage 6 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 6 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ss_sold_date_sk@0)], projection=[ss_customer_sk@3] │ CoalescePartitionsExec │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_customer_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_customer_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2001 AND d_moy@2 >= 4 AND d_moy@2 <= 6, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2001 AND d_moy@8 >= 4 AND d_moy@8 <= 6, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 AND d_moy_null_count@5 != row_count@3 AND d_moy_max@4 >= 4 AND d_moy_null_count@5 != row_count@3 AND d_moy_min@6 <= 6, required_guarantees=[d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 2001 AND d_moy@8 >= 4 AND d_moy@8 <= 6, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1 AND d_moy_null_count@5 != row_count@3 AND d_moy_max@4 >= 4 AND d_moy_null_count@5 != row_count@3 AND d_moy_min@6 <= 6, required_guarantees=[d_year in (2001)] └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -7642,15 +7049,13 @@ mod tests { │ CoalescePartitionsExec │ [Stage 7] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=6, input_tasks=2 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_cdemo_sk, c_current_addr_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_cdemo_sk, c_current_addr_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=3, consumer_tasks=2, output_partitions=6 │ FilterExec: ca_state@1 = KY OR ca_state@1 = GA OR ca_state@1 = NM, projection=[ca_address_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_state], file_type=parquet, predicate=ca_state@8 = KY OR ca_state@8 = GA OR ca_state@8 = NM, pruning_predicate=ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= KY AND KY <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= GA AND GA <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= NM AND NM <= ca_state_max@1, required_guarantees=[ca_state in (GA, KY, NM)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_state], file_type=parquet, predicate=ca_state@8 = KY OR ca_state@8 = GA OR ca_state@8 = NM, pruning_predicate=ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= KY AND KY <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= GA AND GA <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= NM AND NM <= ca_state_max@1, required_guarantees=[ca_state in (GA, KY, NM)] └────────────────────────────────────────────────── "); Ok(()) @@ -7761,9 +7166,8 @@ mod tests { │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=12, input_tasks=4 │ ProjectionExec: expr=[t_time_sk@0 as t_time_sk, t_hour@1 as t_hour, t_minute@2 as t_minute, CAST(t_time_sk@0 AS Float64) as CAST(time_dim.t_time_sk AS Float64)] │ FilterExec: t_meal_time@3 = breakfast OR t_meal_time@3 = dinner, projection=[t_time_sk@0, t_hour@1, t_minute@2] - │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=1 - │ PartitionIsolatorExec: tasks=4 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/time_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-3.parquet]]}, projection=[t_time_sk, t_hour, t_minute, t_meal_time], file_type=parquet, predicate=(t_meal_time@9 = breakfast OR t_meal_time@9 = dinner) AND DynamicFilter [ empty ], pruning_predicate=t_meal_time_null_count@2 != row_count@3 AND t_meal_time_min@0 <= breakfast AND breakfast <= t_meal_time_max@1 OR t_meal_time_null_count@2 != row_count@3 AND t_meal_time_min@0 <= dinner AND dinner <= t_meal_time_max@1, required_guarantees=[t_meal_time in (breakfast, dinner)] + │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/time_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/time_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/time_dim/part-3.parquet]]}, projection=[t_time_sk, t_hour, t_minute, t_meal_time], file_type=parquet, predicate=(t_meal_time@9 = breakfast OR t_meal_time@9 = dinner) AND DynamicFilter [ empty ], pruning_predicate=t_meal_time_null_count@2 != row_count@3 AND t_meal_time_min@0 <= breakfast AND breakfast <= t_meal_time_max@1 OR t_meal_time_null_count@2 != row_count@3 AND t_meal_time_min@0 <= dinner AND dinner <= t_meal_time_max@1, required_guarantees=[t_meal_time in (breakfast, dinner)] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p11] t1:[p12..p23] t2:[p24..p35] t3:[p36..p47] │ BroadcastExec: input_partitions=3, consumer_tasks=4, output_partitions=12 @@ -7772,9 +7176,8 @@ mod tests { │ CoalescePartitionsExec │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=12, input_tasks=4 │ FilterExec: i_manager_id@3 = 1, projection=[i_item_sk@0, i_brand_id@1, i_brand@2] - │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=1 - │ PartitionIsolatorExec: tasks=4 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_brand, i_manager_id], file_type=parquet, predicate=i_manager_id@20 = 1 AND DynamicFilter [ empty ], pruning_predicate=i_manager_id_null_count@2 != row_count@3 AND i_manager_id_min@0 <= 1 AND 1 <= i_manager_id_max@1, required_guarantees=[i_manager_id in (1)] + │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_brand, i_manager_id], file_type=parquet, predicate=i_manager_id@20 = 1 AND DynamicFilter [ empty ], pruning_predicate=i_manager_id_null_count@2 != row_count@3 AND i_manager_id_min@0 <= 1 AND 1 <= i_manager_id_max@1, required_guarantees=[i_manager_id in (1)] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p11] t1:[p12..p23] t2:[p24..p35] t3:[p36..p47] │ BroadcastExec: input_partitions=3, consumer_tasks=4, output_partitions=12 @@ -7783,8 +7186,7 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ws_sold_date_sk@0)], projection=[ws_sold_time_sk@3, ws_item_sk@4, ws_ext_sales_price@5] │ CoalescePartitionsExec │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=2 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_sold_time_sk, ws_item_sk, ws_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_sold_time_sk, ws_item_sk, ws_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] │ ProjectionExec: expr=[cs_ext_sales_price@2 as ext_price, cs_item_sk@1 as sold_item_sk, cs_sold_time_sk@0 as time_sk] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, cs_sold_date_sk@0)], projection=[cs_sold_time_sk@3, cs_item_sk@4, cs_ext_sales_price@5] │ CoalescePartitionsExec @@ -7801,24 +7203,21 @@ mod tests { │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_moy@2 = 11 AND d_year@1 = 1999, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_moy@8 = 11 AND d_year@6 = 1999, pruning_predicate=d_moy_null_count@2 != row_count@3 AND d_moy_min@0 <= 11 AND 11 <= d_moy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 1999 AND 1999 <= d_year_max@5, required_guarantees=[d_moy in (11), d_year in (1999)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_moy@8 = 11 AND d_year@6 = 1999, pruning_predicate=d_moy_null_count@2 != row_count@3 AND d_moy_min@0 <= 11 AND 11 <= d_moy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 1999 AND 1999 <= d_year_max@5, required_guarantees=[d_moy in (11), d_year in (1999)] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p2] t1:[p3..p5] │ BroadcastExec: input_partitions=3, consumer_tasks=1, output_partitions=3 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_moy@2 = 11 AND d_year@1 = 1999, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_moy@8 = 11 AND d_year@6 = 1999, pruning_predicate=d_moy_null_count@2 != row_count@3 AND d_moy_min@0 <= 11 AND 11 <= d_moy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 1999 AND 1999 <= d_year_max@5, required_guarantees=[d_moy in (11), d_year in (1999)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_moy@8 = 11 AND d_year@6 = 1999, pruning_predicate=d_moy_null_count@2 != row_count@3 AND d_moy_min@0 <= 11 AND 11 <= d_moy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 1999 AND 1999 <= d_year_max@5, required_guarantees=[d_moy in (11), d_year in (1999)] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p2] t1:[p3..p5] │ BroadcastExec: input_partitions=3, consumer_tasks=1, output_partitions=3 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_moy@2 = 11 AND d_year@1 = 1999, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_moy@8 = 11 AND d_year@6 = 1999, pruning_predicate=d_moy_null_count@2 != row_count@3 AND d_moy_min@0 <= 11 AND 11 <= d_moy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 1999 AND 1999 <= d_year_max@5, required_guarantees=[d_moy in (11), d_year in (1999)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_moy@8 = 11 AND d_year@6 = 1999, pruning_predicate=d_moy_null_count@2 != row_count@3 AND d_moy_min@0 <= 11 AND 11 <= d_moy_max@1 AND d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 1999 AND 1999 <= d_year_max@5, required_guarantees=[d_moy in (11), d_year in (1999)] └────────────────────────────────────────────────── "); Ok(()) @@ -7826,7 +7225,7 @@ mod tests { #[tokio::test] async fn test_tpcds_72() -> Result<()> { let display = test_tpcds_query("q72").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [total_cnt@5 DESC, i_item_desc@0 ASC, w_warehouse_name@1 ASC, d_week_seq@2 ASC], fetch=100 │ SortExec: TopK(fetch=100), expr=[total_cnt@5 DESC, i_item_desc@0 ASC, w_warehouse_name@1 ASC, d_week_seq@2 ASC], preserve_partitioning=[true] @@ -7849,13 +7248,11 @@ mod tests { │ [Stage 11] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_date@1 as d_date, CAST(d_date_sk@0 AS Float64) as CAST(d3.d_date_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/promotion/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-3.parquet]]}, projection=[p_promo_sk, CAST(p_promo_sk@0 AS Float64) as CAST(promotion.p_promo_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/promotion/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/promotion/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/promotion/part-3.parquet]]}, projection=[p_promo_sk, CAST(p_promo_sk@0 AS Float64) as CAST(promotion.p_promo_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 11 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -7863,8 +7260,7 @@ mod tests { │ CoalescePartitionsExec │ [Stage 10] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_week_seq], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_week_seq], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 10 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -7884,23 +7280,20 @@ mod tests { │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_date@1 as d_date, d_week_seq@2 as d_week_seq, CAST(d_date_sk@0 AS Float64) as CAST(d1.d_date_sk AS Float64)] │ FilterExec: d_year@3 = 1999, projection=[d_date_sk@0, d_date@1, d_week_seq@2] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date, d_week_seq, d_year], file_type=parquet, predicate=d_year@6 = 1999, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1999 AND 1999 <= d_year_max@1, required_guarantees=[d_year in (1999)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date, d_week_seq, d_year], file_type=parquet, predicate=d_year@6 = 1999, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1999 AND 1999 <= d_year_max@1, required_guarantees=[d_year in (1999)] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[hd_demo_sk@0 as hd_demo_sk, CAST(hd_demo_sk@0 AS Float64) as CAST(household_demographics.hd_demo_sk AS Float64)] │ FilterExec: hd_buy_potential@1 = >10000, projection=[hd_demo_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_buy_potential], file_type=parquet, predicate=hd_buy_potential@2 = >10000, pruning_predicate=hd_buy_potential_null_count@2 != row_count@3 AND hd_buy_potential_min@0 <= >10000 AND >10000 <= hd_buy_potential_max@1, required_guarantees=[hd_buy_potential in (>10000)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_buy_potential], file_type=parquet, predicate=hd_buy_potential@2 = >10000, pruning_predicate=hd_buy_potential_null_count@2 != row_count@3 AND hd_buy_potential_min@0 <= >10000 AND >10000 <= hd_buy_potential_max@1, required_guarantees=[hd_buy_potential in (>10000)] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([CAST(customer_demographics.cd_demo_sk AS Float64)@1], 9), input_partitions=2 + │ RepartitionExec: partitioning=Hash([CAST(customer_demographics.cd_demo_sk AS Float64)@1], 9), input_partitions=3 │ ProjectionExec: expr=[cd_demo_sk@0 as cd_demo_sk, CAST(cd_demo_sk@0 AS Float64) as CAST(customer_demographics.cd_demo_sk AS Float64)] │ FilterExec: cd_marital_status@1 = D, projection=[cd_demo_sk@0] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], ...]}, projection=[cd_demo_sk, cd_marital_status], file_type=parquet, predicate=cd_marital_status@2 = D, pruning_predicate=cd_marital_status_null_count@2 != row_count@3 AND cd_marital_status_min@0 <= D AND D <= cd_marital_status_max@1, required_guarantees=[cd_marital_status in (D)] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-3.parquet:..]]}, projection=[cd_demo_sk, cd_marital_status], file_type=parquet, predicate=cd_marital_status@2 = D, pruning_predicate=cd_marital_status_null_count@2 != row_count@3 AND cd_marital_status_min@0 <= D AND D <= cd_marital_status_max@1, required_guarantees=[cd_marital_status in (D)] └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] │ RepartitionExec: partitioning=Hash([cs_bill_cdemo_sk@2], 9), input_partitions=3 @@ -7918,23 +7311,19 @@ mod tests { └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_desc], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_desc], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/warehouse/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-3.parquet]]}, projection=[w_warehouse_sk, w_warehouse_name], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/warehouse/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/warehouse/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/warehouse/part-3.parquet]]}, projection=[w_warehouse_sk, w_warehouse_name], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([cs_item_sk@4], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_ship_date_sk, cs_bill_cdemo_sk, cs_bill_hdemo_sk, cs_item_sk, cs_promo_sk, cs_order_number, cs_quantity], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([cs_item_sk@4], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_ship_date_sk, cs_bill_cdemo_sk, cs_bill_hdemo_sk, cs_item_sk, cs_promo_sk, cs_order_number, cs_quantity], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([inv_item_sk@1], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/inventory/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-2.parquet:..], ...]}, projection=[inv_date_sk, inv_item_sk, inv_warehouse_sk, inv_quantity_on_hand], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([inv_item_sk@1], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/inventory/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/inventory/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/inventory/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/inventory/part-3.parquet:..]]}, projection=[inv_date_sk, inv_item_sk, inv_warehouse_sk, inv_quantity_on_hand], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── "); Ok(()) @@ -7942,7 +7331,7 @@ mod tests { #[tokio::test] async fn test_tpcds_73() -> Result<()> { let display = test_tpcds_query("q73").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [cnt@5 DESC, c_last_name@0 ASC NULLS LAST] │ [Stage 6] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -7955,8 +7344,7 @@ mod tests { │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=6, input_tasks=2 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_salutation@1 as c_salutation, c_first_name@2 as c_first_name, c_last_name@3 as c_last_name, c_preferred_cust_flag@4 as c_preferred_cust_flag, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_salutation, c_first_name, c_last_name, c_preferred_cust_flag], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_salutation, c_first_name, c_last_name, c_preferred_cust_flag], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=3, consumer_tasks=2, output_partitions=6 @@ -7966,7 +7354,7 @@ mod tests { │ [Stage 4] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([ss_ticket_number@0, ss_customer_sk@1], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([ss_ticket_number@0, ss_customer_sk@1], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[ss_ticket_number@1 as ss_ticket_number, ss_customer_sk@0 as ss_customer_sk], aggr=[count(Int64(1))] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(household_demographics.hd_demo_sk AS Float64)@1, ss_hdemo_sk@1)], projection=[ss_customer_sk@2, ss_ticket_number@4] │ CoalescePartitionsExec @@ -7977,32 +7365,28 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ss_sold_date_sk@0)], projection=[ss_customer_sk@3, ss_hdemo_sk@4, ss_store_sk@5, ss_ticket_number@6] │ CoalescePartitionsExec │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_customer_sk, ss_hdemo_sk, ss_store_sk, ss_ticket_number], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_customer_sk, ss_hdemo_sk, ss_store_sk, ss_ticket_number], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[hd_demo_sk@0 as hd_demo_sk, CAST(hd_demo_sk@0 AS Float64) as CAST(household_demographics.hd_demo_sk AS Float64)] │ FilterExec: (hd_buy_potential@1 = Unknown OR hd_buy_potential@1 = >10000) AND hd_vehicle_count@3 > 0 AND CASE WHEN hd_vehicle_count@3 > 0 THEN CAST(hd_dep_count@2 AS Float64) / CAST(hd_vehicle_count@3 AS Float64) END > 1, projection=[hd_demo_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_buy_potential, hd_dep_count, hd_vehicle_count], file_type=parquet, predicate=(hd_buy_potential@2 = Unknown OR hd_buy_potential@2 = >10000) AND hd_vehicle_count@4 > 0 AND CASE WHEN hd_vehicle_count@4 > 0 THEN CAST(hd_dep_count@3 AS Float64) / CAST(hd_vehicle_count@4 AS Float64) END > 1, pruning_predicate=(hd_buy_potential_null_count@2 != row_count@3 AND hd_buy_potential_min@0 <= Unknown AND Unknown <= hd_buy_potential_max@1 OR hd_buy_potential_null_count@2 != row_count@3 AND hd_buy_potential_min@0 <= >10000 AND >10000 <= hd_buy_potential_max@1) AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_max@4 > 0, required_guarantees=[hd_buy_potential in (>10000, Unknown)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_buy_potential, hd_dep_count, hd_vehicle_count], file_type=parquet, predicate=(hd_buy_potential@2 = Unknown OR hd_buy_potential@2 = >10000) AND hd_vehicle_count@4 > 0 AND CASE WHEN hd_vehicle_count@4 > 0 THEN CAST(hd_dep_count@3 AS Float64) / CAST(hd_vehicle_count@4 AS Float64) END > 1, pruning_predicate=(hd_buy_potential_null_count@2 != row_count@3 AND hd_buy_potential_min@0 <= Unknown AND Unknown <= hd_buy_potential_max@1 OR hd_buy_potential_null_count@2 != row_count@3 AND hd_buy_potential_min@0 <= >10000 AND >10000 <= hd_buy_potential_max@1) AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_max@4 > 0, required_guarantees=[hd_buy_potential in (>10000, Unknown)] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[s_store_sk@0 as s_store_sk, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)] │ FilterExec: s_county@1 IN (SET) ([Orange County, Bronx County, Franklin Parish, Williamson County]), projection=[s_store_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_county], file_type=parquet, predicate=s_county@23 IN (SET) ([Orange County, Bronx County, Franklin Parish, Williamson County]), pruning_predicate=s_county_null_count@2 != row_count@3 AND s_county_min@0 <= Orange County AND Orange County <= s_county_max@1 OR s_county_null_count@2 != row_count@3 AND s_county_min@0 <= Bronx County AND Bronx County <= s_county_max@1 OR s_county_null_count@2 != row_count@3 AND s_county_min@0 <= Franklin Parish AND Franklin Parish <= s_county_max@1 OR s_county_null_count@2 != row_count@3 AND s_county_min@0 <= Williamson County AND Williamson County <= s_county_max@1, required_guarantees=[s_county in (Bronx County, Franklin Parish, Orange County, Williamson County)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_county], file_type=parquet, predicate=s_county@23 IN (SET) ([Orange County, Bronx County, Franklin Parish, Williamson County]), pruning_predicate=s_county_null_count@2 != row_count@3 AND s_county_min@0 <= Orange County AND Orange County <= s_county_max@1 OR s_county_null_count@2 != row_count@3 AND s_county_min@0 <= Bronx County AND Bronx County <= s_county_max@1 OR s_county_null_count@2 != row_count@3 AND s_county_min@0 <= Franklin Parish AND Franklin Parish <= s_county_max@1 OR s_county_null_count@2 != row_count@3 AND s_county_min@0 <= Williamson County AND Williamson County <= s_county_max@1, required_guarantees=[s_county in (Bronx County, Franklin Parish, Orange County, Williamson County)] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_dom@2 >= 1 AND d_dom@2 <= 2 AND (d_year@1 = 1999 OR d_year@1 = 2000 OR d_year@1 = 2001), projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_dom], file_type=parquet, predicate=d_dom@9 >= 1 AND d_dom@9 <= 2 AND (d_year@6 = 1999 OR d_year@6 = 2000 OR d_year@6 = 2001), pruning_predicate=d_dom_null_count@1 != row_count@2 AND d_dom_max@0 >= 1 AND d_dom_null_count@1 != row_count@2 AND d_dom_min@3 <= 2 AND (d_year_null_count@6 != row_count@2 AND d_year_min@4 <= 1999 AND 1999 <= d_year_max@5 OR d_year_null_count@6 != row_count@2 AND d_year_min@4 <= 2000 AND 2000 <= d_year_max@5 OR d_year_null_count@6 != row_count@2 AND d_year_min@4 <= 2001 AND 2001 <= d_year_max@5), required_guarantees=[d_year in (1999, 2000, 2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_dom], file_type=parquet, predicate=d_dom@9 >= 1 AND d_dom@9 <= 2 AND (d_year@6 = 1999 OR d_year@6 = 2000 OR d_year@6 = 2001), pruning_predicate=d_dom_null_count@1 != row_count@2 AND d_dom_max@0 >= 1 AND d_dom_null_count@1 != row_count@2 AND d_dom_min@3 <= 2 AND (d_year_null_count@6 != row_count@2 AND d_year_min@4 <= 1999 AND 1999 <= d_year_max@5 OR d_year_null_count@6 != row_count@2 AND d_year_min@4 <= 2000 AND 2000 <= d_year_max@5 OR d_year_null_count@6 != row_count@2 AND d_year_min@4 <= 2001 AND 2001 <= d_year_max@5), required_guarantees=[d_year in (1999, 2000, 2001)] └────────────────────────────────────────────────── "); Ok(()) @@ -8010,7 +7394,7 @@ mod tests { #[tokio::test] async fn test_tpcds_74() -> Result<()> { let display = test_tpcds_query("q74").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [customer_id@0 ASC], fetch=100 │ [Stage 16] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -8046,7 +7430,7 @@ mod tests { │ [Stage 3] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([c_customer_id@0, c_first_name@1, c_last_name@2, d_year@3], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([c_customer_id@0, c_first_name@1, c_last_name@2, d_year@3], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[c_customer_id@0 as c_customer_id, c_first_name@1 as c_first_name, c_last_name@2 as c_last_name, d_year@4 as d_year], aggr=[sum(web_sales.ws_net_paid)], ordering_mode=PartiallySorted([3]) │ ProjectionExec: expr=[c_customer_id@1 as c_customer_id, c_first_name@2 as c_first_name, c_last_name@3 as c_last_name, ws_net_paid@4 as ws_net_paid, d_year@0 as d_year] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@2, ws_sold_date_sk@3)], projection=[d_year@1, c_customer_id@3, c_first_name@4, c_last_name@5, ws_net_paid@7] @@ -8055,23 +7439,20 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(customer.c_customer_sk AS Float64)@4, ws_bill_customer_sk@1)], projection=[c_customer_id@1, c_first_name@2, c_last_name@3, ws_sold_date_sk@5, ws_net_paid@7] │ CoalescePartitionsExec │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_bill_customer_sk, ws_net_paid], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_bill_customer_sk, ws_net_paid], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2001 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2001, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2001, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (2001)] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_customer_id@1 as c_customer_id, c_first_name@2 as c_first_name, c_last_name@3 as c_last_name, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_customer_id, c_first_name, c_last_name], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_customer_id, c_first_name, c_last_name], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=3, consumer_tasks=2, output_partitions=6 @@ -8082,7 +7463,7 @@ mod tests { │ [Stage 7] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([c_customer_id@0, c_first_name@1, c_last_name@2, d_year@3], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([c_customer_id@0, c_first_name@1, c_last_name@2, d_year@3], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[c_customer_id@0 as c_customer_id, c_first_name@1 as c_first_name, c_last_name@2 as c_last_name, d_year@4 as d_year], aggr=[sum(store_sales.ss_net_paid)], ordering_mode=PartiallySorted([3]) │ ProjectionExec: expr=[c_customer_id@1 as c_customer_id, c_first_name@2 as c_first_name, c_last_name@3 as c_last_name, ss_net_paid@4 as ss_net_paid, d_year@0 as d_year] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@2, ss_sold_date_sk@3)], projection=[d_year@1, c_customer_id@3, c_first_name@4, c_last_name@5, ss_net_paid@7] @@ -8091,26 +7472,23 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(customer.c_customer_sk AS Float64)@4, ss_customer_sk@1)], projection=[c_customer_id@1, c_first_name@2, c_last_name@3, ss_sold_date_sk@5, ss_net_paid@7] │ CoalescePartitionsExec │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_customer_sk, ss_net_paid], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_customer_sk, ss_net_paid], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2001 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2001, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2001, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (2001)] └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_customer_id@1 as c_customer_id, c_first_name@2 as c_first_name, c_last_name@3 as c_last_name, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_customer_id, c_first_name, c_last_name], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_customer_id, c_first_name, c_last_name], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 11 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([c_customer_id@0, c_first_name@1, c_last_name@2, d_year@3], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([c_customer_id@0, c_first_name@1, c_last_name@2, d_year@3], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[c_customer_id@0 as c_customer_id, c_first_name@1 as c_first_name, c_last_name@2 as c_last_name, d_year@4 as d_year], aggr=[sum(store_sales.ss_net_paid)], ordering_mode=PartiallySorted([3]) │ ProjectionExec: expr=[c_customer_id@1 as c_customer_id, c_first_name@2 as c_first_name, c_last_name@3 as c_last_name, ss_net_paid@4 as ss_net_paid, d_year@0 as d_year] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@2, ss_sold_date_sk@3)], projection=[d_year@1, c_customer_id@3, c_first_name@4, c_last_name@5, ss_net_paid@7] @@ -8119,26 +7497,23 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(customer.c_customer_sk AS Float64)@4, ss_customer_sk@1)], projection=[c_customer_id@1, c_first_name@2, c_last_name@3, ss_sold_date_sk@5, ss_net_paid@7] │ CoalescePartitionsExec │ [Stage 10] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_customer_sk, ss_net_paid], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_customer_sk, ss_net_paid], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2002 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2002, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1, required_guarantees=[d_year in (2002)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2002, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1, required_guarantees=[d_year in (2002)] └────────────────────────────────────────────────── ┌───── Stage 10 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_customer_id@1 as c_customer_id, c_first_name@2 as c_first_name, c_last_name@3 as c_last_name, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_customer_id, c_first_name, c_last_name], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_customer_id, c_first_name, c_last_name], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 15 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([c_customer_id@0, c_first_name@1, c_last_name@2, d_year@3], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([c_customer_id@0, c_first_name@1, c_last_name@2, d_year@3], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[c_customer_id@0 as c_customer_id, c_first_name@1 as c_first_name, c_last_name@2 as c_last_name, d_year@4 as d_year], aggr=[sum(web_sales.ws_net_paid)], ordering_mode=PartiallySorted([3]) │ ProjectionExec: expr=[c_customer_id@1 as c_customer_id, c_first_name@2 as c_first_name, c_last_name@3 as c_last_name, ws_net_paid@4 as ws_net_paid, d_year@0 as d_year] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@2, ws_sold_date_sk@3)], projection=[d_year@1, c_customer_id@3, c_first_name@4, c_last_name@5, ws_net_paid@7] @@ -8147,23 +7522,20 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(customer.c_customer_sk AS Float64)@4, ws_bill_customer_sk@1)], projection=[c_customer_id@1, c_first_name@2, c_last_name@3, ws_sold_date_sk@5, ws_net_paid@7] │ CoalescePartitionsExec │ [Stage 14] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_bill_customer_sk, ws_net_paid], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_bill_customer_sk, ws_net_paid], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 13 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2002 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2002, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1, required_guarantees=[d_year in (2002)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2002, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1, required_guarantees=[d_year in (2002)] └────────────────────────────────────────────────── ┌───── Stage 14 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_customer_id@1 as c_customer_id, c_first_name@2 as c_first_name, c_last_name@3 as c_last_name, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_customer_id, c_first_name, c_last_name], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_customer_id, c_first_name, c_last_name], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── "); Ok(()) @@ -8171,7 +7543,7 @@ mod tests { #[tokio::test] async fn test_tpcds_75() -> Result<()> { let display = test_tpcds_query("q75").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [sales_cnt_diff@8 ASC NULLS LAST, sales_amt_diff@9 ASC NULLS LAST], fetch=100 │ SortExec: TopK(fetch=100), expr=[sales_cnt_diff@8 ASC NULLS LAST, sales_amt_diff@9 ASC NULLS LAST], preserve_partitioning=[true] @@ -8200,23 +7572,23 @@ mod tests { │ ProjectionExec: expr=[CAST(cr_return_amount@8 AS Decimal128(30, 15)) as __common_expr_1, cs_quantity@0 as cs_quantity, cs_ext_sales_price@1 as cs_ext_sales_price, i_brand_id@2 as i_brand_id, i_class_id@3 as i_class_id, i_category_id@4 as i_category_id, i_manufact_id@5 as i_manufact_id, d_year@6 as d_year, cr_return_quantity@7 as cr_return_quantity] │ HashJoinExec: mode=CollectLeft, join_type=Left, on=[(cs_order_number@1, cr_order_number@1), (cs_item_sk@0, cr_item_sk@0)], projection=[cs_quantity@2, cs_ext_sales_price@3, i_brand_id@4, i_class_id@5, i_category_id@6, i_manufact_id@7, d_year@8, cr_return_quantity@11, cr_return_amount@12] │ CoalescePartitionsExec - │ [Stage 3] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 3] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-3.parquet:..]]}, projection=[cr_item_sk, cr_order_number, cr_return_quantity, cr_return_amount], file_type=parquet │ ProjectionExec: expr=[d_year@7 as d_year, i_brand_id@3 as i_brand_id, i_class_id@4 as i_class_id, i_category_id@5 as i_category_id, i_manufact_id@6 as i_manufact_id, ss_quantity@1 - CASE WHEN sr_return_quantity@8 IS NOT NULL THEN sr_return_quantity@8 ELSE 0 END as sales_cnt, ss_ext_sales_price@2 - CASE WHEN __common_expr_2@0 IS NOT NULL THEN __common_expr_2@0 ELSE Some(0),30,15 END as sales_amt] │ ProjectionExec: expr=[CAST(sr_return_amt@8 AS Decimal128(30, 15)) as __common_expr_2, ss_quantity@0 as ss_quantity, ss_ext_sales_price@1 as ss_ext_sales_price, i_brand_id@2 as i_brand_id, i_class_id@3 as i_class_id, i_category_id@4 as i_category_id, i_manufact_id@5 as i_manufact_id, d_year@6 as d_year, sr_return_quantity@7 as sr_return_quantity] │ HashJoinExec: mode=CollectLeft, join_type=Left, on=[(ss_ticket_number@1, sr_ticket_number@1), (ss_item_sk@0, sr_item_sk@0)], projection=[ss_quantity@2, ss_ext_sales_price@3, i_brand_id@4, i_class_id@5, i_category_id@6, i_manufact_id@7, d_year@8, sr_return_quantity@11, sr_return_amt@12] │ CoalescePartitionsExec - │ [Stage 6] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 6] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-3.parquet:..]]}, projection=[sr_item_sk, sr_ticket_number, sr_return_quantity, sr_return_amt], file_type=parquet │ ProjectionExec: expr=[d_year@7 as d_year, i_brand_id@3 as i_brand_id, i_class_id@4 as i_class_id, i_category_id@5 as i_category_id, i_manufact_id@6 as i_manufact_id, ws_quantity@1 - CASE WHEN wr_return_quantity@8 IS NOT NULL THEN wr_return_quantity@8 ELSE 0 END as sales_cnt, ws_ext_sales_price@2 - CASE WHEN __common_expr_3@0 IS NOT NULL THEN __common_expr_3@0 ELSE Some(0),30,15 END as sales_amt] │ ProjectionExec: expr=[CAST(wr_return_amt@8 AS Decimal128(30, 15)) as __common_expr_3, ws_quantity@0 as ws_quantity, ws_ext_sales_price@1 as ws_ext_sales_price, i_brand_id@2 as i_brand_id, i_class_id@3 as i_class_id, i_category_id@4 as i_category_id, i_manufact_id@5 as i_manufact_id, d_year@6 as d_year, wr_return_quantity@7 as wr_return_quantity] │ HashJoinExec: mode=CollectLeft, join_type=Left, on=[(ws_order_number@1, wr_order_number@1), (ws_item_sk@0, wr_item_sk@0)], projection=[ws_quantity@2, ws_ext_sales_price@3, i_brand_id@4, i_class_id@5, i_category_id@6, i_manufact_id@7, d_year@8, wr_return_quantity@11, wr_return_amt@12] │ CoalescePartitionsExec - │ [Stage 9] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 9] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 │ DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_returns/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/web_returns/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_returns/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/web_returns/part-3.parquet]]}, projection=[wr_item_sk, wr_order_number, wr_return_quantity, wr_return_amt], file_type=parquet └────────────────────────────────────────────────── - ┌───── Stage 3 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 3 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ ProjectionExec: expr=[cs_item_sk@1 as cs_item_sk, cs_order_number@2 as cs_order_number, cs_quantity@3 as cs_quantity, cs_ext_sales_price@4 as cs_ext_sales_price, i_brand_id@5 as i_brand_id, i_class_id@6 as i_class_id, i_category_id@7 as i_category_id, i_manufact_id@8 as i_manufact_id, d_year@0 as d_year] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@2, cs_sold_date_sk@0)], projection=[d_year@1, cs_item_sk@4, cs_order_number@5, cs_quantity@6, cs_ext_sales_price@7, i_brand_id@8, i_class_id@9, i_category_id@10, i_manufact_id@11] │ CoalescePartitionsExec @@ -8225,25 +7597,22 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, cs_item_sk@1)], projection=[i_brand_id@1, i_class_id@2, i_category_id@3, i_manufact_id@4, cs_sold_date_sk@5, cs_item_sk@6, cs_order_number@7, cs_quantity@8, cs_ext_sales_price@9] │ CoalescePartitionsExec │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_item_sk, cs_order_number, cs_quantity, cs_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_item_sk, cs_order_number, cs_quantity, cs_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2002 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2002, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1, required_guarantees=[d_year in (2002)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2002, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1, required_guarantees=[d_year in (2002)] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ FilterExec: i_category@4 = Books, projection=[i_item_sk@0, i_brand_id@1, i_class_id@2, i_category_id@3, i_manufact_id@5] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id, i_category, i_manufact_id], file_type=parquet, predicate=i_category@12 = Books, pruning_predicate=i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Books AND Books <= i_category_max@1, required_guarantees=[i_category in (Books)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id, i_category, i_manufact_id], file_type=parquet, predicate=i_category@12 = Books, pruning_predicate=i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Books AND Books <= i_category_max@1, required_guarantees=[i_category in (Books)] └────────────────────────────────────────────────── - ┌───── Stage 6 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 6 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ ProjectionExec: expr=[ss_item_sk@1 as ss_item_sk, ss_ticket_number@2 as ss_ticket_number, ss_quantity@3 as ss_quantity, ss_ext_sales_price@4 as ss_ext_sales_price, i_brand_id@5 as i_brand_id, i_class_id@6 as i_class_id, i_category_id@7 as i_category_id, i_manufact_id@8 as i_manufact_id, d_year@0 as d_year] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@2, ss_sold_date_sk@0)], projection=[d_year@1, ss_item_sk@4, ss_ticket_number@5, ss_quantity@6, ss_ext_sales_price@7, i_brand_id@8, i_class_id@9, i_category_id@10, i_manufact_id@11] │ CoalescePartitionsExec @@ -8252,25 +7621,22 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, ss_item_sk@1)], projection=[i_brand_id@1, i_class_id@2, i_category_id@3, i_manufact_id@4, ss_sold_date_sk@5, ss_item_sk@6, ss_ticket_number@7, ss_quantity@8, ss_ext_sales_price@9] │ CoalescePartitionsExec │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_ticket_number, ss_quantity, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_ticket_number, ss_quantity, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2002 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2002, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1, required_guarantees=[d_year in (2002)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2002, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1, required_guarantees=[d_year in (2002)] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ FilterExec: i_category@4 = Books, projection=[i_item_sk@0, i_brand_id@1, i_class_id@2, i_category_id@3, i_manufact_id@5] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id, i_category, i_manufact_id], file_type=parquet, predicate=i_category@12 = Books, pruning_predicate=i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Books AND Books <= i_category_max@1, required_guarantees=[i_category in (Books)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id, i_category, i_manufact_id], file_type=parquet, predicate=i_category@12 = Books, pruning_predicate=i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Books AND Books <= i_category_max@1, required_guarantees=[i_category in (Books)] └────────────────────────────────────────────────── - ┌───── Stage 9 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 9 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ ProjectionExec: expr=[ws_item_sk@1 as ws_item_sk, ws_order_number@2 as ws_order_number, ws_quantity@3 as ws_quantity, ws_ext_sales_price@4 as ws_ext_sales_price, i_brand_id@5 as i_brand_id, i_class_id@6 as i_class_id, i_category_id@7 as i_category_id, i_manufact_id@8 as i_manufact_id, d_year@0 as d_year] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@2, ws_sold_date_sk@0)], projection=[d_year@1, ws_item_sk@4, ws_order_number@5, ws_quantity@6, ws_ext_sales_price@7, i_brand_id@8, i_class_id@9, i_category_id@10, i_manufact_id@11] │ CoalescePartitionsExec @@ -8279,23 +7645,20 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, ws_item_sk@1)], projection=[i_brand_id@1, i_class_id@2, i_category_id@3, i_manufact_id@4, ws_sold_date_sk@5, ws_item_sk@6, ws_order_number@7, ws_quantity@8, ws_ext_sales_price@9] │ CoalescePartitionsExec │ [Stage 8] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_item_sk, ws_order_number, ws_quantity, ws_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_item_sk, ws_order_number, ws_quantity, ws_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2002 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2002, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1, required_guarantees=[d_year in (2002)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2002, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2002 AND 2002 <= d_year_max@1, required_guarantees=[d_year in (2002)] └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ FilterExec: i_category@4 = Books, projection=[i_item_sk@0, i_brand_id@1, i_class_id@2, i_category_id@3, i_manufact_id@5] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id, i_category, i_manufact_id], file_type=parquet, predicate=i_category@12 = Books, pruning_predicate=i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Books AND Books <= i_category_max@1, required_guarantees=[i_category in (Books)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id, i_category, i_manufact_id], file_type=parquet, predicate=i_category@12 = Books, pruning_predicate=i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Books AND Books <= i_category_max@1, required_guarantees=[i_category in (Books)] └────────────────────────────────────────────────── ┌───── Stage 22 ── Tasks: t0:[p0..p2] t1:[p0..p2] │ RepartitionExec: partitioning=Hash([d_year@0, i_brand_id@1, i_class_id@2, i_category_id@3, i_manufact_id@4], 3), input_partitions=3 @@ -8311,23 +7674,23 @@ mod tests { │ ProjectionExec: expr=[CAST(cr_return_amount@8 AS Decimal128(30, 15)) as __common_expr_4, cs_quantity@0 as cs_quantity, cs_ext_sales_price@1 as cs_ext_sales_price, i_brand_id@2 as i_brand_id, i_class_id@3 as i_class_id, i_category_id@4 as i_category_id, i_manufact_id@5 as i_manufact_id, d_year@6 as d_year, cr_return_quantity@7 as cr_return_quantity] │ HashJoinExec: mode=CollectLeft, join_type=Left, on=[(cs_order_number@1, cr_order_number@1), (cs_item_sk@0, cr_item_sk@0)], projection=[cs_quantity@2, cs_ext_sales_price@3, i_brand_id@4, i_class_id@5, i_category_id@6, i_manufact_id@7, d_year@8, cr_return_quantity@11, cr_return_amount@12] │ CoalescePartitionsExec - │ [Stage 14] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 14] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-3.parquet:..]]}, projection=[cr_item_sk, cr_order_number, cr_return_quantity, cr_return_amount], file_type=parquet │ ProjectionExec: expr=[d_year@7 as d_year, i_brand_id@3 as i_brand_id, i_class_id@4 as i_class_id, i_category_id@5 as i_category_id, i_manufact_id@6 as i_manufact_id, ss_quantity@1 - CASE WHEN sr_return_quantity@8 IS NOT NULL THEN sr_return_quantity@8 ELSE 0 END as sales_cnt, ss_ext_sales_price@2 - CASE WHEN __common_expr_5@0 IS NOT NULL THEN __common_expr_5@0 ELSE Some(0),30,15 END as sales_amt] │ ProjectionExec: expr=[CAST(sr_return_amt@8 AS Decimal128(30, 15)) as __common_expr_5, ss_quantity@0 as ss_quantity, ss_ext_sales_price@1 as ss_ext_sales_price, i_brand_id@2 as i_brand_id, i_class_id@3 as i_class_id, i_category_id@4 as i_category_id, i_manufact_id@5 as i_manufact_id, d_year@6 as d_year, sr_return_quantity@7 as sr_return_quantity] │ HashJoinExec: mode=CollectLeft, join_type=Left, on=[(ss_ticket_number@1, sr_ticket_number@1), (ss_item_sk@0, sr_item_sk@0)], projection=[ss_quantity@2, ss_ext_sales_price@3, i_brand_id@4, i_class_id@5, i_category_id@6, i_manufact_id@7, d_year@8, sr_return_quantity@11, sr_return_amt@12] │ CoalescePartitionsExec - │ [Stage 17] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 17] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-3.parquet:..]]}, projection=[sr_item_sk, sr_ticket_number, sr_return_quantity, sr_return_amt], file_type=parquet │ ProjectionExec: expr=[d_year@7 as d_year, i_brand_id@3 as i_brand_id, i_class_id@4 as i_class_id, i_category_id@5 as i_category_id, i_manufact_id@6 as i_manufact_id, ws_quantity@1 - CASE WHEN wr_return_quantity@8 IS NOT NULL THEN wr_return_quantity@8 ELSE 0 END as sales_cnt, ws_ext_sales_price@2 - CASE WHEN __common_expr_6@0 IS NOT NULL THEN __common_expr_6@0 ELSE Some(0),30,15 END as sales_amt] │ ProjectionExec: expr=[CAST(wr_return_amt@8 AS Decimal128(30, 15)) as __common_expr_6, ws_quantity@0 as ws_quantity, ws_ext_sales_price@1 as ws_ext_sales_price, i_brand_id@2 as i_brand_id, i_class_id@3 as i_class_id, i_category_id@4 as i_category_id, i_manufact_id@5 as i_manufact_id, d_year@6 as d_year, wr_return_quantity@7 as wr_return_quantity] │ HashJoinExec: mode=CollectLeft, join_type=Left, on=[(ws_order_number@1, wr_order_number@1), (ws_item_sk@0, wr_item_sk@0)], projection=[ws_quantity@2, ws_ext_sales_price@3, i_brand_id@4, i_class_id@5, i_category_id@6, i_manufact_id@7, d_year@8, wr_return_quantity@11, wr_return_amt@12] │ CoalescePartitionsExec - │ [Stage 20] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 20] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 │ DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_returns/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/web_returns/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_returns/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/web_returns/part-3.parquet]]}, projection=[wr_item_sk, wr_order_number, wr_return_quantity, wr_return_amt], file_type=parquet └────────────────────────────────────────────────── - ┌───── Stage 14 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 14 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ ProjectionExec: expr=[cs_item_sk@1 as cs_item_sk, cs_order_number@2 as cs_order_number, cs_quantity@3 as cs_quantity, cs_ext_sales_price@4 as cs_ext_sales_price, i_brand_id@5 as i_brand_id, i_class_id@6 as i_class_id, i_category_id@7 as i_category_id, i_manufact_id@8 as i_manufact_id, d_year@0 as d_year] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@2, cs_sold_date_sk@0)], projection=[d_year@1, cs_item_sk@4, cs_order_number@5, cs_quantity@6, cs_ext_sales_price@7, i_brand_id@8, i_class_id@9, i_category_id@10, i_manufact_id@11] │ CoalescePartitionsExec @@ -8336,25 +7699,22 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, cs_item_sk@1)], projection=[i_brand_id@1, i_class_id@2, i_category_id@3, i_manufact_id@4, cs_sold_date_sk@5, cs_item_sk@6, cs_order_number@7, cs_quantity@8, cs_ext_sales_price@9] │ CoalescePartitionsExec │ [Stage 13] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_item_sk, cs_order_number, cs_quantity, cs_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_item_sk, cs_order_number, cs_quantity, cs_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 12 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2001 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2001, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2001, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (2001)] └────────────────────────────────────────────────── ┌───── Stage 13 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ FilterExec: i_category@4 = Books, projection=[i_item_sk@0, i_brand_id@1, i_class_id@2, i_category_id@3, i_manufact_id@5] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id, i_category, i_manufact_id], file_type=parquet, predicate=i_category@12 = Books AND DynamicFilter [ empty ], pruning_predicate=i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Books AND Books <= i_category_max@1, required_guarantees=[i_category in (Books)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id, i_category, i_manufact_id], file_type=parquet, predicate=i_category@12 = Books AND DynamicFilter [ empty ], pruning_predicate=i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Books AND Books <= i_category_max@1, required_guarantees=[i_category in (Books)] └────────────────────────────────────────────────── - ┌───── Stage 17 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 17 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ ProjectionExec: expr=[ss_item_sk@1 as ss_item_sk, ss_ticket_number@2 as ss_ticket_number, ss_quantity@3 as ss_quantity, ss_ext_sales_price@4 as ss_ext_sales_price, i_brand_id@5 as i_brand_id, i_class_id@6 as i_class_id, i_category_id@7 as i_category_id, i_manufact_id@8 as i_manufact_id, d_year@0 as d_year] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@2, ss_sold_date_sk@0)], projection=[d_year@1, ss_item_sk@4, ss_ticket_number@5, ss_quantity@6, ss_ext_sales_price@7, i_brand_id@8, i_class_id@9, i_category_id@10, i_manufact_id@11] │ CoalescePartitionsExec @@ -8363,25 +7723,22 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, ss_item_sk@1)], projection=[i_brand_id@1, i_class_id@2, i_category_id@3, i_manufact_id@4, ss_sold_date_sk@5, ss_item_sk@6, ss_ticket_number@7, ss_quantity@8, ss_ext_sales_price@9] │ CoalescePartitionsExec │ [Stage 16] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_ticket_number, ss_quantity, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_ticket_number, ss_quantity, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 15 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2001 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2001, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2001, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (2001)] └────────────────────────────────────────────────── ┌───── Stage 16 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ FilterExec: i_category@4 = Books, projection=[i_item_sk@0, i_brand_id@1, i_class_id@2, i_category_id@3, i_manufact_id@5] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id, i_category, i_manufact_id], file_type=parquet, predicate=i_category@12 = Books AND DynamicFilter [ empty ], pruning_predicate=i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Books AND Books <= i_category_max@1, required_guarantees=[i_category in (Books)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id, i_category, i_manufact_id], file_type=parquet, predicate=i_category@12 = Books AND DynamicFilter [ empty ], pruning_predicate=i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Books AND Books <= i_category_max@1, required_guarantees=[i_category in (Books)] └────────────────────────────────────────────────── - ┌───── Stage 20 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 20 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ ProjectionExec: expr=[ws_item_sk@1 as ws_item_sk, ws_order_number@2 as ws_order_number, ws_quantity@3 as ws_quantity, ws_ext_sales_price@4 as ws_ext_sales_price, i_brand_id@5 as i_brand_id, i_class_id@6 as i_class_id, i_category_id@7 as i_category_id, i_manufact_id@8 as i_manufact_id, d_year@0 as d_year] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@2, ws_sold_date_sk@0)], projection=[d_year@1, ws_item_sk@4, ws_order_number@5, ws_quantity@6, ws_ext_sales_price@7, i_brand_id@8, i_class_id@9, i_category_id@10, i_manufact_id@11] │ CoalescePartitionsExec @@ -8390,23 +7747,20 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, ws_item_sk@1)], projection=[i_brand_id@1, i_class_id@2, i_category_id@3, i_manufact_id@4, ws_sold_date_sk@5, ws_item_sk@6, ws_order_number@7, ws_quantity@8, ws_ext_sales_price@9] │ CoalescePartitionsExec │ [Stage 19] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_item_sk, ws_order_number, ws_quantity, ws_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_item_sk, ws_order_number, ws_quantity, ws_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 18 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2001 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2001, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2001, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2001 AND 2001 <= d_year_max@1, required_guarantees=[d_year in (2001)] └────────────────────────────────────────────────── ┌───── Stage 19 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ FilterExec: i_category@4 = Books, projection=[i_item_sk@0, i_brand_id@1, i_class_id@2, i_category_id@3, i_manufact_id@5] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id, i_category, i_manufact_id], file_type=parquet, predicate=i_category@12 = Books AND DynamicFilter [ empty ], pruning_predicate=i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Books AND Books <= i_category_max@1, required_guarantees=[i_category in (Books)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand_id, i_class_id, i_category_id, i_category, i_manufact_id], file_type=parquet, predicate=i_category@12 = Books AND DynamicFilter [ empty ], pruning_predicate=i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Books AND Books <= i_category_max@1, required_guarantees=[i_category in (Books)] └────────────────────────────────────────────────── "); Ok(()) @@ -8446,67 +7800,58 @@ mod tests { │ RepartitionExec: partitioning=Hash([CAST(date_dim.d_date_sk AS Float64)@3], 6), input_partitions=3 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, d_qoy@2 as d_qoy, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_qoy], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_qoy], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([ss_sold_date_sk@0], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([ss_sold_date_sk@0], 6), input_partitions=3 │ ProjectionExec: expr=[ss_sold_date_sk@1 as ss_sold_date_sk, ss_ext_sales_price@2 as ss_ext_sales_price, i_category@0 as i_category] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, ss_item_sk@1)], projection=[i_category@1, ss_sold_date_sk@2, ss_ext_sales_price@4] │ CoalescePartitionsExec │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 │ FilterExec: ss_store_sk@2 IS NULL, projection=[ss_sold_date_sk@0, ss_item_sk@1, ss_ext_sales_price@3] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_store_sk, ss_ext_sales_price], file_type=parquet, predicate=ss_store_sk@7 IS NULL AND DynamicFilter [ empty ] AND DynamicFilter [ empty ], pruning_predicate=ss_store_sk_null_count@0 > 0, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_store_sk, ss_ext_sales_price], file_type=parquet, predicate=ss_store_sk@7 IS NULL AND DynamicFilter [ empty ] AND DynamicFilter [ empty ], pruning_predicate=ss_store_sk_null_count@0 > 0, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_category], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_category], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p2] t1:[p0..p2] │ RepartitionExec: partitioning=Hash([CAST(date_dim.d_date_sk AS Float64)@3], 3), input_partitions=3 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, d_qoy@2 as d_qoy, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_qoy], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_qoy], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p2] t1:[p0..p2] t2:[p0..p2] - │ RepartitionExec: partitioning=Hash([ws_sold_date_sk@0], 3), input_partitions=2 + │ RepartitionExec: partitioning=Hash([ws_sold_date_sk@0], 3), input_partitions=3 │ ProjectionExec: expr=[ws_sold_date_sk@1 as ws_sold_date_sk, ws_ext_sales_price@2 as ws_ext_sales_price, i_category@0 as i_category] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, ws_item_sk@1)], projection=[i_category@1, ws_sold_date_sk@2, ws_ext_sales_price@4] │ CoalescePartitionsExec │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 │ FilterExec: ws_ship_customer_sk@2 IS NULL, projection=[ws_sold_date_sk@0, ws_item_sk@1, ws_ext_sales_price@3] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_item_sk, ws_ship_customer_sk, ws_ext_sales_price], file_type=parquet, predicate=ws_ship_customer_sk@8 IS NULL AND DynamicFilter [ empty ] AND DynamicFilter [ empty ], pruning_predicate=ws_ship_customer_sk_null_count@0 > 0, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_item_sk, ws_ship_customer_sk, ws_ext_sales_price], file_type=parquet, predicate=ws_ship_customer_sk@8 IS NULL AND DynamicFilter [ empty ] AND DynamicFilter [ empty ], pruning_predicate=ws_ship_customer_sk_null_count@0 > 0, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_category], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_category], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p2] t1:[p0..p2] │ RepartitionExec: partitioning=Hash([CAST(date_dim.d_date_sk AS Float64)@3], 3), input_partitions=3 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, d_qoy@2 as d_qoy, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_qoy], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_qoy], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p2] t1:[p0..p2] t2:[p0..p2] - │ RepartitionExec: partitioning=Hash([cs_sold_date_sk@0], 3), input_partitions=2 + │ RepartitionExec: partitioning=Hash([cs_sold_date_sk@0], 3), input_partitions=3 │ ProjectionExec: expr=[cs_sold_date_sk@1 as cs_sold_date_sk, cs_ext_sales_price@2 as cs_ext_sales_price, i_category@0 as i_category] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, cs_item_sk@1)], projection=[i_category@1, cs_sold_date_sk@2, cs_ext_sales_price@4] │ CoalescePartitionsExec │ [Stage 8] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 │ FilterExec: cs_ship_addr_sk@1 IS NULL, projection=[cs_sold_date_sk@0, cs_item_sk@2, cs_ext_sales_price@3] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_ship_addr_sk, cs_item_sk, cs_ext_sales_price], file_type=parquet, predicate=cs_ship_addr_sk@10 IS NULL AND DynamicFilter [ empty ] AND DynamicFilter [ empty ], pruning_predicate=cs_ship_addr_sk_null_count@0 > 0, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_ship_addr_sk, cs_item_sk, cs_ext_sales_price], file_type=parquet, predicate=cs_ship_addr_sk@10 IS NULL AND DynamicFilter [ empty ] AND DynamicFilter [ empty ], pruning_predicate=cs_ship_addr_sk_null_count@0 > 0, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_category], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_category], file_type=parquet └────────────────────────────────────────────────── "); Ok(()) @@ -8514,7 +7859,7 @@ mod tests { #[tokio::test] async fn test_tpcds_77() -> Result<()> { let display = test_tpcds_query("q77").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [channel@0 ASC, id@1 ASC, returns_@3 DESC], fetch=100 │ [Stage 21] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -8560,7 +7905,7 @@ mod tests { │ [Stage 3] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([s_store_sk@0], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([s_store_sk@0], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[s_store_sk@2 as s_store_sk], aggr=[sum(store_returns.sr_return_amt), sum(store_returns.sr_net_loss)] │ ProjectionExec: expr=[sr_return_amt@1 as sr_return_amt, sr_net_loss@2 as sr_net_loss, s_store_sk@0 as s_store_sk] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(store.s_store_sk AS Float64)@1, sr_store_sk@0)], projection=[s_store_sk@0, sr_return_amt@3, sr_net_loss@4] @@ -8569,24 +7914,21 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, sr_returned_date_sk@0)], projection=[sr_store_sk@3, sr_return_amt@4, sr_net_loss@5] │ CoalescePartitionsExec │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], ...]}, projection=[sr_returned_date_sk, sr_store_sk, sr_return_amt, sr_net_loss], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-3.parquet:..]]}, projection=[sr_returned_date_sk, sr_store_sk, sr_return_amt, sr_net_loss], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_date@1 >= 2000-08-23 AND d_date@1 <= 2000-09-22, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-08-23 AND d_date@2 <= 2000-09-22, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-08-23 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-09-22, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-08-23 AND d_date@2 <= 2000-09-22, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-08-23 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-09-22, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([s_store_sk@0], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([s_store_sk@0], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[s_store_sk@2 as s_store_sk], aggr=[sum(store_sales.ss_ext_sales_price), sum(store_sales.ss_net_profit)] │ ProjectionExec: expr=[ss_ext_sales_price@1 as ss_ext_sales_price, ss_net_profit@2 as ss_net_profit, s_store_sk@0 as s_store_sk] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(store.s_store_sk AS Float64)@1, ss_store_sk@0)], projection=[s_store_sk@0, ss_ext_sales_price@3, ss_net_profit@4] @@ -8595,21 +7937,18 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ss_sold_date_sk@0)], projection=[ss_store_sk@3, ss_ext_sales_price@4, ss_net_profit@5] │ CoalescePartitionsExec │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_store_sk, ss_ext_sales_price, ss_net_profit], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_store_sk, ss_ext_sales_price, ss_net_profit], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_date@1 >= 2000-08-23 AND d_date@1 <= 2000-09-22, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-08-23 AND d_date@2 <= 2000-09-22, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-08-23 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-09-22, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-08-23 AND d_date@2 <= 2000-09-22, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-08-23 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-09-22, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 10 ── Tasks: t0:[p0..p2] t1:[p0..p2] │ ProjectionExec: expr=[cs_call_center_sk@0 as cs_call_center_sk, sum(catalog_sales.cs_ext_sales_price)@1 as sales, sum(catalog_sales.cs_net_profit)@2 as profit] @@ -8617,37 +7956,33 @@ mod tests { │ [Stage 9] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([cs_call_center_sk@0], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([cs_call_center_sk@0], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[cs_call_center_sk@0 as cs_call_center_sk], aggr=[sum(catalog_sales.cs_ext_sales_price), sum(catalog_sales.cs_net_profit)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, cs_sold_date_sk@0)], projection=[cs_call_center_sk@3, cs_ext_sales_price@4, cs_net_profit@5] │ CoalescePartitionsExec │ [Stage 8] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_call_center_sk, cs_ext_sales_price, cs_net_profit], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_call_center_sk, cs_ext_sales_price, cs_net_profit], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_date@1 >= 2000-08-23 AND d_date@1 <= 2000-09-22, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-08-23 AND d_date@2 <= 2000-09-22, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-08-23 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-09-22, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-08-23 AND d_date@2 <= 2000-09-22, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-08-23 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-09-22, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 12 ── Tasks: t0:[p0..p2] t1:[p0..p2] t2:[p0..p2] - │ RepartitionExec: partitioning=Hash([cr_call_center_sk@0], 3), input_partitions=2 + │ RepartitionExec: partitioning=Hash([cr_call_center_sk@0], 3), input_partitions=3 │ AggregateExec: mode=Partial, gby=[cr_call_center_sk@0 as cr_call_center_sk], aggr=[sum(catalog_returns.cr_return_amount), sum(catalog_returns.cr_net_loss)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(d_date_sk@0, cr_returned_date_sk@0)], projection=[cr_call_center_sk@2, cr_return_amount@3, cr_net_loss@4] │ CoalescePartitionsExec │ [Stage 11] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], ...]}, projection=[cr_returned_date_sk, cr_call_center_sk, cr_return_amount, cr_net_loss], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-3.parquet:..]]}, projection=[cr_returned_date_sk, cr_call_center_sk, cr_return_amount, cr_net_loss], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 11 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ FilterExec: d_date@1 >= 2000-08-23 AND d_date@1 <= 2000-09-22, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-08-23 AND d_date@2 <= 2000-09-22, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-08-23 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-09-22, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-08-23 AND d_date@2 <= 2000-09-22, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-08-23 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-09-22, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 16 ── Tasks: t0:[p0..p2] t1:[p0..p2] │ ProjectionExec: expr=[wp_web_page_sk@0 as wp_web_page_sk, sum(web_sales.ws_ext_sales_price)@1 as sales, sum(web_sales.ws_net_profit)@2 as profit] @@ -8655,7 +7990,7 @@ mod tests { │ [Stage 15] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 15 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([wp_web_page_sk@0], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([wp_web_page_sk@0], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[wp_web_page_sk@2 as wp_web_page_sk], aggr=[sum(web_sales.ws_ext_sales_price), sum(web_sales.ws_net_profit)] │ ProjectionExec: expr=[ws_ext_sales_price@1 as ws_ext_sales_price, ws_net_profit@2 as ws_net_profit, wp_web_page_sk@0 as wp_web_page_sk] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(web_page.wp_web_page_sk AS Float64)@1, ws_web_page_sk@0)], projection=[wp_web_page_sk@0, ws_ext_sales_price@3, ws_net_profit@4] @@ -8664,21 +7999,18 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ws_sold_date_sk@0)], projection=[ws_web_page_sk@3, ws_ext_sales_price@4, ws_net_profit@5] │ CoalescePartitionsExec │ [Stage 14] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_web_page_sk, ws_ext_sales_price, ws_net_profit], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_web_page_sk, ws_ext_sales_price, ws_net_profit], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 13 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_page/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_page/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_page/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_page/part-3.parquet]]}, projection=[wp_web_page_sk, CAST(wp_web_page_sk@0 AS Float64) as CAST(web_page.wp_web_page_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_page/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/web_page/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_page/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/web_page/part-3.parquet]]}, projection=[wp_web_page_sk, CAST(wp_web_page_sk@0 AS Float64) as CAST(web_page.wp_web_page_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 14 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_date@1 >= 2000-08-23 AND d_date@1 <= 2000-09-22, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-08-23 AND d_date@2 <= 2000-09-22, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-08-23 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-09-22, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-08-23 AND d_date@2 <= 2000-09-22, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-08-23 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-09-22, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 19 ── Tasks: t0:[p0..p2] t1:[p0..p2] │ RepartitionExec: partitioning=Hash([wp_web_page_sk@0], 3), input_partitions=3 @@ -8691,21 +8023,18 @@ mod tests { │ CoalescePartitionsExec │ [Stage 18] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=6, input_tasks=2 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_returns/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_returns/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_returns/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_returns/part-3.parquet]]}, projection=[wr_returned_date_sk, wr_web_page_sk, wr_return_amt, wr_net_loss], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_returns/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/web_returns/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_returns/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/web_returns/part-3.parquet]]}, projection=[wr_returned_date_sk, wr_web_page_sk, wr_return_amt, wr_net_loss], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 17 ── Tasks: t0:[p0..p3] t1:[p4..p7] │ BroadcastExec: input_partitions=2, consumer_tasks=2, output_partitions=4 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_page/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_page/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_page/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_page/part-3.parquet]]}, projection=[wp_web_page_sk, CAST(wp_web_page_sk@0 AS Float64) as CAST(web_page.wp_web_page_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_page/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/web_page/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_page/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/web_page/part-3.parquet]]}, projection=[wp_web_page_sk, CAST(wp_web_page_sk@0 AS Float64) as CAST(web_page.wp_web_page_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 18 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=3, consumer_tasks=2, output_partitions=6 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_date@1 >= 2000-08-23 AND d_date@1 <= 2000-09-22, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-08-23 AND d_date@2 <= 2000-09-22, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-08-23 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-09-22, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-08-23 AND d_date@2 <= 2000-09-22, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-08-23 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-09-22, required_guarantees=[] └────────────────────────────────────────────────── "); Ok(()) @@ -8713,7 +8042,7 @@ mod tests { #[tokio::test] async fn test_tpcds_78() -> Result<()> { let display = test_tpcds_query("q78").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ ProjectionExec: expr=[ss_sold_year@0 as ss_sold_year, ss_item_sk@1 as ss_item_sk, ss_customer_sk@2 as ss_customer_sk, ratio@3 as ratio, store_qty@4 as store_qty, store_wholesale_cost@5 as store_wholesale_cost, store_sales_price@6 as store_sales_price, other_chan_qty@7 as other_chan_qty, other_chan_wholesale_cost@8 as other_chan_wholesale_cost, other_chan_sales_price@9 as other_chan_sales_price] │ SortPreservingMergeExec: [ss_sold_year@0 ASC NULLS LAST, ss_item_sk@1 ASC NULLS LAST, ss_customer_sk@2 ASC NULLS LAST, ss_qty@10 DESC, ss_wc@11 DESC, ss_sp@12 DESC, other_chan_qty@7 ASC NULLS LAST, other_chan_wholesale_cost@8 ASC NULLS LAST, other_chan_sales_price@9 ASC NULLS LAST, ratio@3 ASC NULLS LAST], fetch=100 @@ -8760,18 +8089,15 @@ mod tests { │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2000 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2000, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1, required_guarantees=[d_year in (2000)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2000, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1, required_guarantees=[d_year in (2000)] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([cr_order_number@1, cr_item_sk@0], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], ...]}, projection=[cr_item_sk, cr_order_number], file_type=parquet + │ RepartitionExec: partitioning=Hash([cr_order_number@1, cr_item_sk@0], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-3.parquet:..]]}, projection=[cr_item_sk, cr_order_number], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([cs_order_number@3, cs_item_sk@2], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_bill_customer_sk, cs_item_sk, cs_order_number, cs_quantity, cs_wholesale_cost, cs_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([cs_order_number@3, cs_item_sk@2], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_bill_customer_sk, cs_item_sk, cs_order_number, cs_quantity, cs_wholesale_cost, cs_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=3, consumer_tasks=2, output_partitions=6 @@ -8780,7 +8106,7 @@ mod tests { │ [Stage 8] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([d_year@0, ws_item_sk@1, ws_bill_customer_sk@2], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([d_year@0, ws_item_sk@1, ws_bill_customer_sk@2], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[d_year@5 as d_year, ws_item_sk@0 as ws_item_sk, ws_bill_customer_sk@1 as ws_bill_customer_sk], aggr=[sum(web_sales.ws_quantity), sum(web_sales.ws_wholesale_cost), sum(web_sales.ws_sales_price)], ordering_mode=PartiallySorted([0]) │ ProjectionExec: expr=[ws_item_sk@1 as ws_item_sk, ws_bill_customer_sk@2 as ws_bill_customer_sk, ws_quantity@3 as ws_quantity, ws_wholesale_cost@4 as ws_wholesale_cost, ws_sales_price@5 as ws_sales_price, d_year@0 as d_year] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@2, ws_sold_date_sk@0)], projection=[d_year@1, ws_item_sk@4, ws_bill_customer_sk@5, ws_quantity@6, ws_wholesale_cost@7, ws_sales_price@8] @@ -8791,21 +8117,18 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Right, on=[(wr_order_number@1, ws_order_number@3), (wr_item_sk@0, ws_item_sk@1)], projection=[wr_order_number@1, ws_sold_date_sk@2, ws_item_sk@3, ws_bill_customer_sk@4, ws_quantity@6, ws_wholesale_cost@7, ws_sales_price@8] │ CoalescePartitionsExec │ [Stage 7] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_item_sk, ws_bill_customer_sk, ws_order_number, ws_quantity, ws_wholesale_cost, ws_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_item_sk, ws_bill_customer_sk, ws_order_number, ws_quantity, ws_wholesale_cost, ws_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2000 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2000, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1, required_guarantees=[d_year in (2000)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2000, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1, required_guarantees=[d_year in (2000)] └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_returns/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_returns/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_returns/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_returns/part-3.parquet]]}, projection=[wr_item_sk, wr_order_number], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_returns/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/web_returns/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_returns/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/web_returns/part-3.parquet]]}, projection=[wr_item_sk, wr_order_number], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 13 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] │ RepartitionExec: partitioning=Hash([d_year@0, ss_item_sk@1, ss_customer_sk@2], 6), input_partitions=3 @@ -8825,18 +8148,15 @@ mod tests { │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_year@1 as d_year, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2000 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2000, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1, required_guarantees=[d_year in (2000)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2000, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1, required_guarantees=[d_year in (2000)] └────────────────────────────────────────────────── ┌───── Stage 11 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([sr_ticket_number@1, sr_item_sk@0], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], ...]}, projection=[sr_item_sk, sr_ticket_number], file_type=parquet + │ RepartitionExec: partitioning=Hash([sr_ticket_number@1, sr_item_sk@0], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-3.parquet:..]]}, projection=[sr_item_sk, sr_ticket_number], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 12 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([ss_ticket_number@3, ss_item_sk@1], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_customer_sk, ss_ticket_number, ss_quantity, ss_wholesale_cost, ss_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([ss_ticket_number@3, ss_item_sk@1], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_customer_sk, ss_ticket_number, ss_quantity, ss_wholesale_cost, ss_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── "); Ok(()) @@ -8844,7 +8164,7 @@ mod tests { #[tokio::test] async fn test_tpcds_79() -> Result<()> { let display = test_tpcds_query("q79").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [c_last_name@0 ASC, c_first_name@1 ASC, substr(ms.s_city,Int64(1),Int64(30))@2 ASC, profit@5 ASC, ss_ticket_number@3 ASC NULLS LAST], fetch=100 │ [Stage 6] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -8857,8 +8177,7 @@ mod tests { │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=6, input_tasks=2 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_first_name@1 as c_first_name, c_last_name@2 as c_last_name, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_first_name, c_last_name], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_first_name, c_last_name], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=3, consumer_tasks=2, output_partitions=6 @@ -8867,7 +8186,7 @@ mod tests { │ [Stage 4] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([ss_ticket_number@0, ss_customer_sk@1, ss_addr_sk@2, s_city@3], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([ss_ticket_number@0, ss_customer_sk@1, ss_addr_sk@2, s_city@3], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[ss_ticket_number@2 as ss_ticket_number, ss_customer_sk@0 as ss_customer_sk, ss_addr_sk@1 as ss_addr_sk, s_city@5 as s_city], aggr=[sum(store_sales.ss_coupon_amt), sum(store_sales.ss_net_profit)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(household_demographics.hd_demo_sk AS Float64)@1, ss_hdemo_sk@1)], projection=[ss_customer_sk@2, ss_addr_sk@4, ss_ticket_number@5, ss_coupon_amt@6, ss_net_profit@7, s_city@8] │ CoalescePartitionsExec @@ -8879,32 +8198,28 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ss_sold_date_sk@0)], projection=[ss_customer_sk@3, ss_hdemo_sk@4, ss_addr_sk@5, ss_store_sk@6, ss_ticket_number@7, ss_coupon_amt@8, ss_net_profit@9] │ CoalescePartitionsExec │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_customer_sk, ss_hdemo_sk, ss_addr_sk, ss_store_sk, ss_ticket_number, ss_coupon_amt, ss_net_profit], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_customer_sk, ss_hdemo_sk, ss_addr_sk, ss_store_sk, ss_ticket_number, ss_coupon_amt, ss_net_profit], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[hd_demo_sk@0 as hd_demo_sk, CAST(hd_demo_sk@0 AS Float64) as CAST(household_demographics.hd_demo_sk AS Float64)] │ FilterExec: hd_dep_count@1 = 6 OR hd_vehicle_count@2 > 2, projection=[hd_demo_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_dep_count, hd_vehicle_count], file_type=parquet, predicate=hd_dep_count@3 = 6 OR hd_vehicle_count@4 > 2, pruning_predicate=hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 6 AND 6 <= hd_dep_count_max@1 OR hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_max@4 > 2, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_dep_count, hd_vehicle_count], file_type=parquet, predicate=hd_dep_count@3 = 6 OR hd_vehicle_count@4 > 2, pruning_predicate=hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 6 AND 6 <= hd_dep_count_max@1 OR hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_max@4 > 2, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[s_store_sk@0 as s_store_sk, s_city@1 as s_city, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)] │ FilterExec: s_number_employees@1 >= 200 AND s_number_employees@1 <= 295, projection=[s_store_sk@0, s_city@2] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_number_employees, s_city], file_type=parquet, predicate=s_number_employees@6 >= 200 AND s_number_employees@6 <= 295, pruning_predicate=s_number_employees_null_count@1 != row_count@2 AND s_number_employees_max@0 >= 200 AND s_number_employees_null_count@1 != row_count@2 AND s_number_employees_min@3 <= 295, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_number_employees, s_city], file_type=parquet, predicate=s_number_employees@6 >= 200 AND s_number_employees@6 <= 295, pruning_predicate=s_number_employees_null_count@1 != row_count@2 AND s_number_employees_max@0 >= 200 AND s_number_employees_null_count@1 != row_count@2 AND s_number_employees_min@3 <= 295, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_dow@2 = 1 AND (d_year@1 = 1999 OR d_year@1 = 2000 OR d_year@1 = 2001), projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_dow], file_type=parquet, predicate=d_dow@7 = 1 AND (d_year@6 = 1999 OR d_year@6 = 2000 OR d_year@6 = 2001), pruning_predicate=d_dow_null_count@2 != row_count@3 AND d_dow_min@0 <= 1 AND 1 <= d_dow_max@1 AND (d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 1999 AND 1999 <= d_year_max@5 OR d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 2000 AND 2000 <= d_year_max@5 OR d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 2001 AND 2001 <= d_year_max@5), required_guarantees=[d_dow in (1), d_year in (1999, 2000, 2001)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_dow], file_type=parquet, predicate=d_dow@7 = 1 AND (d_year@6 = 1999 OR d_year@6 = 2000 OR d_year@6 = 2001), pruning_predicate=d_dow_null_count@2 != row_count@3 AND d_dow_min@0 <= 1 AND 1 <= d_dow_max@1 AND (d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 1999 AND 1999 <= d_year_max@5 OR d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 2000 AND 2000 <= d_year_max@5 OR d_year_null_count@6 != row_count@3 AND d_year_min@4 <= 2001 AND 2001 <= d_year_max@5), required_guarantees=[d_dow in (1), d_year in (1999, 2000, 2001)] └────────────────────────────────────────────────── "); Ok(()) @@ -8961,38 +8276,32 @@ mod tests { │ ProjectionExec: expr=[p_promo_sk@0 as p_promo_sk, CAST(p_promo_sk@0 AS Float64) as CAST(promotion.p_promo_sk AS Float64)] │ FilterExec: p_channel_tv@1 = N, projection=[p_promo_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/promotion/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-3.parquet]]}, projection=[p_promo_sk, p_channel_tv], file_type=parquet, predicate=p_channel_tv@11 = N, pruning_predicate=p_channel_tv_null_count@2 != row_count@3 AND p_channel_tv_min@0 <= N AND N <= p_channel_tv_max@1, required_guarantees=[p_channel_tv in (N)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/promotion/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/promotion/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/promotion/part-3.parquet]]}, projection=[p_promo_sk, p_channel_tv], file_type=parquet, predicate=p_channel_tv@11 = N, pruning_predicate=p_channel_tv_null_count@2 != row_count@3 AND p_channel_tv_min@0 <= N AND N <= p_channel_tv_max@1, required_guarantees=[p_channel_tv in (N)] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ FilterExec: i_current_price@1 > Some(5000),4,2, projection=[i_item_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_current_price], file_type=parquet, predicate=i_current_price@5 > Some(5000),4,2, pruning_predicate=i_current_price_null_count@1 != row_count@2 AND i_current_price_max@0 > Some(5000),4,2, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_current_price], file_type=parquet, predicate=i_current_price@5 > Some(5000),4,2, pruning_predicate=i_current_price_null_count@1 != row_count@2 AND i_current_price_max@0 > Some(5000),4,2, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_id, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_id, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_date@1 >= 2000-08-23 AND d_date@1 <= 2000-09-22, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-08-23 AND d_date@2 <= 2000-09-22, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-08-23 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-09-22, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-08-23 AND d_date@2 <= 2000-09-22, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-08-23 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-09-22, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([sr_item_sk@0, sr_ticket_number@1], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], ...]}, projection=[sr_item_sk, sr_ticket_number, sr_return_amt, sr_net_loss], file_type=parquet + │ RepartitionExec: partitioning=Hash([sr_item_sk@0, sr_ticket_number@1], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-3.parquet:..]]}, projection=[sr_item_sk, sr_ticket_number, sr_return_amt, sr_net_loss], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([ss_item_sk@1, ss_ticket_number@4], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_store_sk, ss_promo_sk, ss_ticket_number, ss_ext_sales_price, ss_net_profit], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([ss_item_sk@1, ss_ticket_number@4], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_store_sk, ss_promo_sk, ss_ticket_number, ss_ext_sales_price, ss_net_profit], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 14 ── Tasks: t0:[p0..p2] t1:[p0..p2] t2:[p0..p2] │ RepartitionExec: partitioning=Hash([cp_catalog_page_id@0], 3), input_partitions=3 @@ -9021,40 +8330,34 @@ mod tests { │ ProjectionExec: expr=[p_promo_sk@0 as p_promo_sk, CAST(p_promo_sk@0 AS Float64) as CAST(promotion.p_promo_sk AS Float64)] │ FilterExec: p_channel_tv@1 = N, projection=[p_promo_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/promotion/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-3.parquet]]}, projection=[p_promo_sk, p_channel_tv], file_type=parquet, predicate=p_channel_tv@11 = N, pruning_predicate=p_channel_tv_null_count@2 != row_count@3 AND p_channel_tv_min@0 <= N AND N <= p_channel_tv_max@1, required_guarantees=[p_channel_tv in (N)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/promotion/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/promotion/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/promotion/part-3.parquet]]}, projection=[p_promo_sk, p_channel_tv], file_type=parquet, predicate=p_channel_tv@11 = N, pruning_predicate=p_channel_tv_null_count@2 != row_count@3 AND p_channel_tv_min@0 <= N AND N <= p_channel_tv_max@1, required_guarantees=[p_channel_tv in (N)] └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ FilterExec: i_current_price@1 > Some(5000),4,2, projection=[i_item_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_current_price], file_type=parquet, predicate=i_current_price@5 > Some(5000),4,2, pruning_predicate=i_current_price_null_count@1 != row_count@2 AND i_current_price_max@0 > Some(5000),4,2, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_current_price], file_type=parquet, predicate=i_current_price@5 > Some(5000),4,2, pruning_predicate=i_current_price_null_count@1 != row_count@2 AND i_current_price_max@0 > Some(5000),4,2, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 10 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[cp_catalog_page_sk@0 as cp_catalog_page_sk, cp_catalog_page_id@1 as cp_catalog_page_id, CAST(cp_catalog_page_sk@0 AS Float64) as CAST(catalog_page.cp_catalog_page_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_page/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/catalog_page/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/catalog_page/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/catalog_page/part-3.parquet]]}, projection=[cp_catalog_page_sk, cp_catalog_page_id], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_page/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/catalog_page/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/catalog_page/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/catalog_page/part-3.parquet]]}, projection=[cp_catalog_page_sk, cp_catalog_page_id], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 11 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_date@1 >= 2000-08-23 AND d_date@1 <= 2000-09-22, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-08-23 AND d_date@2 <= 2000-09-22, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-08-23 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-09-22, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-08-23 AND d_date@2 <= 2000-09-22, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-08-23 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-09-22, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 12 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([cr_item_sk@0, cr_order_number@1], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], ...]}, projection=[cr_item_sk, cr_order_number, cr_return_amount, cr_net_loss], file_type=parquet + │ RepartitionExec: partitioning=Hash([cr_item_sk@0, cr_order_number@1], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-3.parquet:..]]}, projection=[cr_item_sk, cr_order_number, cr_return_amount, cr_net_loss], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 13 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([cs_item_sk@2, cs_order_number@4], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_catalog_page_sk, cs_item_sk, cs_promo_sk, cs_order_number, cs_ext_sales_price, cs_net_profit], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([cs_item_sk@2, cs_order_number@4], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_catalog_page_sk, cs_item_sk, cs_promo_sk, cs_order_number, cs_ext_sales_price, cs_net_profit], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 21 ── Tasks: t0:[p0..p2] t1:[p0..p2] t2:[p0..p2] │ RepartitionExec: partitioning=Hash([web_site_id@0], 3), input_partitions=3 @@ -9066,8 +8369,7 @@ mod tests { │ ProjectionExec: expr=[p_promo_sk@0 as p_promo_sk, CAST(p_promo_sk@0 AS Float64) as CAST(promotion.p_promo_sk AS Float64)] │ FilterExec: p_channel_tv@1 = N, projection=[p_promo_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/promotion/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-3.parquet]]}, projection=[p_promo_sk, p_channel_tv], file_type=parquet, predicate=p_channel_tv@11 = N AND DynamicFilter [ empty ], pruning_predicate=p_channel_tv_null_count@2 != row_count@3 AND p_channel_tv_min@0 <= N AND N <= p_channel_tv_max@1, required_guarantees=[p_channel_tv in (N)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/promotion/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/promotion/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/promotion/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/promotion/part-3.parquet]]}, projection=[p_promo_sk, p_channel_tv], file_type=parquet, predicate=p_channel_tv@11 = N AND DynamicFilter [ empty ], pruning_predicate=p_channel_tv_null_count@2 != row_count@3 AND p_channel_tv_min@0 <= N AND N <= p_channel_tv_max@1, required_guarantees=[p_channel_tv in (N)] └────────────────────────────────────────────────── ┌───── Stage 20 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -9076,8 +8378,7 @@ mod tests { │ [Stage 19] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ FilterExec: i_current_price@1 > Some(5000),4,2, projection=[i_item_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_current_price], file_type=parquet, predicate=i_current_price@5 > Some(5000),4,2 AND DynamicFilter [ empty ], pruning_predicate=i_current_price_null_count@1 != row_count@2 AND i_current_price_max@0 > Some(5000),4,2, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_current_price], file_type=parquet, predicate=i_current_price@5 > Some(5000),4,2 AND DynamicFilter [ empty ], pruning_predicate=i_current_price_null_count@1 != row_count@2 AND i_current_price_max@0 > Some(5000),4,2, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 19 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -9095,26 +8396,22 @@ mod tests { └────────────────────────────────────────────────── ┌───── Stage 15 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_site/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_site/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_site/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_site/part-3.parquet]]}, projection=[web_site_sk, web_site_id, CAST(web_site_sk@0 AS Float64) as CAST(web_site.web_site_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_site/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/web_site/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_site/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/web_site/part-3.parquet]]}, projection=[web_site_sk, web_site_id, CAST(web_site_sk@0 AS Float64) as CAST(web_site.web_site_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 16 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_date@1 >= 2000-08-23 AND d_date@1 <= 2000-09-22, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-08-23 AND d_date@2 <= 2000-09-22, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-08-23 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-09-22, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-08-23 AND d_date@2 <= 2000-09-22, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-08-23 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-09-22, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 17 ── Tasks: t0:[p0..p8] t1:[p0..p8] │ RepartitionExec: partitioning=Hash([wr_item_sk@0, wr_order_number@1], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_returns/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_returns/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_returns/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_returns/part-3.parquet]]}, projection=[wr_item_sk, wr_order_number, wr_return_amt, wr_net_loss], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_returns/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/web_returns/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_returns/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/web_returns/part-3.parquet]]}, projection=[wr_item_sk, wr_order_number, wr_return_amt, wr_net_loss], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 18 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([ws_item_sk@1, ws_order_number@4], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_item_sk, ws_web_site_sk, ws_promo_sk, ws_order_number, ws_ext_sales_price, ws_net_profit], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([ws_item_sk@1, ws_order_number@4], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_item_sk, ws_web_site_sk, ws_promo_sk, ws_order_number, ws_ext_sales_price, ws_net_profit], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── "); Ok(()) @@ -9122,7 +8419,7 @@ mod tests { #[tokio::test] async fn test_tpcds_81() -> Result<()> { let display = test_tpcds_query("q81").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [c_customer_id@0 ASC NULLS LAST, c_salutation@1 ASC NULLS LAST, c_first_name@2 ASC NULLS LAST, c_last_name@3 ASC NULLS LAST, ca_street_number@4 ASC NULLS LAST, ca_street_name@5 ASC NULLS LAST, ca_street_type@6 ASC NULLS LAST, ca_suite_number@7 ASC NULLS LAST, ca_city@8 ASC NULLS LAST, ca_county@9 ASC NULLS LAST, ca_state@10 ASC NULLS LAST, ca_zip@11 ASC NULLS LAST, ca_country@12 ASC NULLS LAST, ca_gmt_offset@13 ASC NULLS LAST, ca_location_type@14 ASC NULLS LAST, ctr_total_return@15 ASC NULLS LAST], fetch=100 │ [Stage 11] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -9148,15 +8445,13 @@ mod tests { │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=6, input_tasks=2 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_customer_id@1 as c_customer_id, c_current_addr_sk@2 as c_current_addr_sk, c_salutation@3 as c_salutation, c_first_name@4 as c_first_name, c_last_name@5 as c_last_name, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_customer_id, c_current_addr_sk, c_salutation, c_first_name, c_last_name], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_customer_id, c_current_addr_sk, c_salutation, c_first_name, c_last_name], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=3, consumer_tasks=2, output_partitions=6 │ FilterExec: ca_state@7 = GA │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_street_number, ca_street_name, ca_street_type, ca_suite_number, ca_city, ca_county, ca_state, ca_zip, ca_country, ca_gmt_offset, ca_location_type], file_type=parquet, predicate=ca_state@8 = GA, pruning_predicate=ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= GA AND GA <= ca_state_max@1, required_guarantees=[ca_state in (GA)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_street_number, ca_street_name, ca_street_type, ca_suite_number, ca_city, ca_county, ca_state, ca_zip, ca_country, ca_gmt_offset, ca_location_type], file_type=parquet, predicate=ca_state@8 = GA, pruning_predicate=ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= GA AND GA <= ca_state_max@1, required_guarantees=[ca_state in (GA)] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=3, consumer_tasks=2, output_partitions=6 @@ -9169,26 +8464,23 @@ mod tests { │ AggregateExec: mode=Partial, gby=[cr_returning_customer_sk@0 as cr_returning_customer_sk, ca_state@2 as ca_state], aggr=[sum(catalog_returns.cr_return_amt_inc_tax)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(cr_returning_addr_sk@1, CAST(customer_address.ca_address_sk AS Float64)@2)], projection=[cr_returning_customer_sk@0, cr_return_amt_inc_tax@2, ca_state@4] │ CoalescePartitionsExec - │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[ca_address_sk@0 as ca_address_sk, ca_state@1 as ca_state, CAST(ca_address_sk@0 AS Float64) as CAST(customer_address.ca_address_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_state], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_state], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(d_date_sk@0, cr_returned_date_sk@0)], projection=[cr_returning_customer_sk@2, cr_returning_addr_sk@3, cr_return_amt_inc_tax@4] │ CoalescePartitionsExec │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], ...]}, projection=[cr_returned_date_sk, cr_returning_customer_sk, cr_returning_addr_sk, cr_return_amt_inc_tax], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-3.parquet:..]]}, projection=[cr_returned_date_sk, cr_returning_customer_sk, cr_returning_addr_sk, cr_return_amt_inc_tax], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ FilterExec: d_year@1 = 2000, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2000, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1, required_guarantees=[d_year in (2000)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2000, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1, required_guarantees=[d_year in (2000)] └────────────────────────────────────────────────── ┌───── Stage 10 ── Tasks: t0:[p0..p5] t1:[p0..p5] │ RepartitionExec: partitioning=Hash([ctr_state@0], 6), input_partitions=3 @@ -9202,26 +8494,23 @@ mod tests { │ AggregateExec: mode=Partial, gby=[cr_returning_customer_sk@0 as cr_returning_customer_sk, ca_state@2 as ca_state], aggr=[sum(catalog_returns.cr_return_amt_inc_tax)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(cr_returning_addr_sk@1, CAST(customer_address.ca_address_sk AS Float64)@2)], projection=[cr_returning_customer_sk@0, cr_return_amt_inc_tax@2, ca_state@4] │ CoalescePartitionsExec - │ [Stage 8] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 8] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[ca_address_sk@0 as ca_address_sk, ca_state@1 as ca_state, CAST(ca_address_sk@0 AS Float64) as CAST(customer_address.ca_address_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_state], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_state], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 8 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 8 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(d_date_sk@0, cr_returned_date_sk@0)], projection=[cr_returning_customer_sk@2, cr_returning_addr_sk@3, cr_return_amt_inc_tax@4] │ CoalescePartitionsExec │ [Stage 7] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], ...]}, projection=[cr_returned_date_sk, cr_returning_customer_sk, cr_returning_addr_sk, cr_return_amt_inc_tax], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-3.parquet:..]]}, projection=[cr_returned_date_sk, cr_returning_customer_sk, cr_returning_addr_sk, cr_return_amt_inc_tax], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ FilterExec: d_year@1 = 2000, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2000, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1, required_guarantees=[d_year in (2000)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2000, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1, required_guarantees=[d_year in (2000)] └────────────────────────────────────────────────── "); Ok(()) @@ -9229,7 +8518,7 @@ mod tests { #[tokio::test] async fn test_tpcds_82() -> Result<()> { let display = test_tpcds_query("q82").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [i_item_id@0 ASC NULLS LAST], fetch=100 │ [Stage 5] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -9240,39 +8529,35 @@ mod tests { │ [Stage 4] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([i_item_id@0, i_item_desc@1, i_current_price@2], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([i_item_id@0, i_item_desc@1, i_current_price@2], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[i_item_id@0 as i_item_id, i_item_desc@1 as i_item_desc, i_current_price@2 as i_current_price], aggr=[] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, ss_item_sk@0)], projection=[i_item_id@1, i_item_desc@2, i_current_price@3] │ CoalescePartitionsExec │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_item_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_item_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(inv_date_sk@4, d_date_sk@0)], projection=[i_item_sk@0, i_item_id@1, i_item_desc@2, i_current_price@3] │ CoalescePartitionsExec - │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ FilterExec: d_date@1 >= 2000-05-25 AND d_date@1 <= 2000-07-24, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-05-25 AND d_date@2 <= 2000-07-24 AND DynamicFilter [ empty ], pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-05-25 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-07-24, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-05-25 AND d_date@2 <= 2000-07-24 AND DynamicFilter [ empty ], pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-05-25 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-07-24, required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, inv_item_sk@1)], projection=[i_item_sk@0, i_item_id@1, i_item_desc@2, i_current_price@3, inv_date_sk@4] │ CoalescePartitionsExec │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 │ FilterExec: inv_quantity_on_hand@2 >= 100 AND inv_quantity_on_hand@2 <= 500, projection=[inv_date_sk@0, inv_item_sk@1] - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/inventory/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-2.parquet:..], ...]}, projection=[inv_date_sk, inv_item_sk, inv_quantity_on_hand], file_type=parquet, predicate=inv_quantity_on_hand@3 >= 100 AND inv_quantity_on_hand@3 <= 500 AND DynamicFilter [ empty ], pruning_predicate=inv_quantity_on_hand_null_count@1 != row_count@2 AND inv_quantity_on_hand_max@0 >= 100 AND inv_quantity_on_hand_null_count@1 != row_count@2 AND inv_quantity_on_hand_min@3 <= 500, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/inventory/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/inventory/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/inventory/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/inventory/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/inventory/part-3.parquet:..]]}, projection=[inv_date_sk, inv_item_sk, inv_quantity_on_hand], file_type=parquet, predicate=inv_quantity_on_hand@3 >= 100 AND inv_quantity_on_hand@3 <= 500 AND DynamicFilter [ empty ], pruning_predicate=inv_quantity_on_hand_null_count@1 != row_count@2 AND inv_quantity_on_hand_max@0 >= 100 AND inv_quantity_on_hand_null_count@1 != row_count@2 AND inv_quantity_on_hand_min@3 <= 500, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ FilterExec: i_current_price@3 >= Some(6200),4,2 AND i_current_price@3 <= Some(9200),4,2 AND i_manufact_id@4 IN (SET) ([129, 270, 821, 423]), projection=[i_item_sk@0, i_item_id@1, i_item_desc@2, i_current_price@3] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id, i_item_desc, i_current_price, i_manufact_id], file_type=parquet, predicate=i_current_price@5 >= Some(6200),4,2 AND i_current_price@5 <= Some(9200),4,2 AND i_manufact_id@13 IN (SET) ([129, 270, 821, 423]) AND DynamicFilter [ empty ], pruning_predicate=i_current_price_null_count@1 != row_count@2 AND i_current_price_max@0 >= Some(6200),4,2 AND i_current_price_null_count@1 != row_count@2 AND i_current_price_min@3 <= Some(9200),4,2 AND (i_manufact_id_null_count@6 != row_count@2 AND i_manufact_id_min@4 <= 129 AND 129 <= i_manufact_id_max@5 OR i_manufact_id_null_count@6 != row_count@2 AND i_manufact_id_min@4 <= 270 AND 270 <= i_manufact_id_max@5 OR i_manufact_id_null_count@6 != row_count@2 AND i_manufact_id_min@4 <= 821 AND 821 <= i_manufact_id_max@5 OR i_manufact_id_null_count@6 != row_count@2 AND i_manufact_id_min@4 <= 423 AND 423 <= i_manufact_id_max@5), required_guarantees=[i_manufact_id in (129, 270, 423, 821)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id, i_item_desc, i_current_price, i_manufact_id], file_type=parquet, predicate=i_current_price@5 >= Some(6200),4,2 AND i_current_price@5 <= Some(9200),4,2 AND i_manufact_id@13 IN (SET) ([129, 270, 821, 423]) AND DynamicFilter [ empty ], pruning_predicate=i_current_price_null_count@1 != row_count@2 AND i_current_price_max@0 >= Some(6200),4,2 AND i_current_price_null_count@1 != row_count@2 AND i_current_price_min@3 <= Some(9200),4,2 AND (i_manufact_id_null_count@6 != row_count@2 AND i_manufact_id_min@4 <= 129 AND 129 <= i_manufact_id_max@5 OR i_manufact_id_null_count@6 != row_count@2 AND i_manufact_id_min@4 <= 270 AND 270 <= i_manufact_id_max@5 OR i_manufact_id_null_count@6 != row_count@2 AND i_manufact_id_min@4 <= 821 AND 821 <= i_manufact_id_max@5 OR i_manufact_id_null_count@6 != row_count@2 AND i_manufact_id_min@4 <= 423 AND 423 <= i_manufact_id_max@5), required_guarantees=[i_manufact_id in (129, 270, 423, 821)] └────────────────────────────────────────────────── "); Ok(()) @@ -9320,8 +8605,7 @@ mod tests { │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=6, input_tasks=2 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_date@1 as d_date, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=3, consumer_tasks=2, output_partitions=6 @@ -9330,20 +8614,17 @@ mod tests { │ CoalescePartitionsExec │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=4, input_tasks=2 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_returns/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_returns/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_returns/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_returns/part-3.parquet]]}, projection=[wr_returned_date_sk, wr_item_sk, wr_return_quantity], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_returns/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/web_returns/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_returns/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/web_returns/part-3.parquet]]}, projection=[wr_returned_date_sk, wr_item_sk, wr_return_quantity], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p3] t1:[p4..p7] │ BroadcastExec: input_partitions=2, consumer_tasks=2, output_partitions=4 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p2] t1:[p3..p5] │ BroadcastExec: input_partitions=3, consumer_tasks=1, output_partitions=3 │ FilterExec: d_date@0 = 2000-06-30 OR d_date@0 = 2000-09-27 OR d_date@0 = 2000-11-17, projection=[d_week_seq@1] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date, d_week_seq], file_type=parquet, predicate=d_date@2 = 2000-06-30 OR d_date@2 = 2000-09-27 OR d_date@2 = 2000-11-17, pruning_predicate=d_date_null_count@2 != row_count@3 AND d_date_min@0 <= 2000-06-30 AND 2000-06-30 <= d_date_max@1 OR d_date_null_count@2 != row_count@3 AND d_date_min@0 <= 2000-09-27 AND 2000-09-27 <= d_date_max@1 OR d_date_null_count@2 != row_count@3 AND d_date_min@0 <= 2000-11-17 AND 2000-11-17 <= d_date_max@1, required_guarantees=[d_date in (2000-06-30, 2000-09-27, 2000-11-17)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date, d_week_seq], file_type=parquet, predicate=d_date@2 = 2000-06-30 OR d_date@2 = 2000-09-27 OR d_date@2 = 2000-11-17, pruning_predicate=d_date_null_count@2 != row_count@3 AND d_date_min@0 <= 2000-06-30 AND 2000-06-30 <= d_date_max@1 OR d_date_null_count@2 != row_count@3 AND d_date_min@0 <= 2000-09-27 AND 2000-09-27 <= d_date_max@1 OR d_date_null_count@2 != row_count@3 AND d_date_min@0 <= 2000-11-17 AND 2000-11-17 <= d_date_max@1, required_guarantees=[d_date in (2000-06-30, 2000-09-27, 2000-11-17)] └────────────────────────────────────────────────── ┌───── Stage 12 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=3, consumer_tasks=2, output_partitions=6 @@ -9368,39 +8649,34 @@ mod tests { │ CoalescePartitionsExec │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=6, input_tasks=2 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date, d_week_seq], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date, d_week_seq], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=3, consumer_tasks=2, output_partitions=6 │ FilterExec: d_date@0 = 2000-06-30 OR d_date@0 = 2000-09-27 OR d_date@0 = 2000-11-17, projection=[d_week_seq@1] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date, d_week_seq], file_type=parquet, predicate=d_date@2 = 2000-06-30 OR d_date@2 = 2000-09-27 OR d_date@2 = 2000-11-17, pruning_predicate=d_date_null_count@2 != row_count@3 AND d_date_min@0 <= 2000-06-30 AND 2000-06-30 <= d_date_max@1 OR d_date_null_count@2 != row_count@3 AND d_date_min@0 <= 2000-09-27 AND 2000-09-27 <= d_date_max@1 OR d_date_null_count@2 != row_count@3 AND d_date_min@0 <= 2000-11-17 AND 2000-11-17 <= d_date_max@1, required_guarantees=[d_date in (2000-06-30, 2000-09-27, 2000-11-17)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date, d_week_seq], file_type=parquet, predicate=d_date@2 = 2000-06-30 OR d_date@2 = 2000-09-27 OR d_date@2 = 2000-11-17, pruning_predicate=d_date_null_count@2 != row_count@3 AND d_date_min@0 <= 2000-06-30 AND 2000-06-30 <= d_date_max@1 OR d_date_null_count@2 != row_count@3 AND d_date_min@0 <= 2000-09-27 AND 2000-09-27 <= d_date_max@1 OR d_date_null_count@2 != row_count@3 AND d_date_min@0 <= 2000-11-17 AND 2000-11-17 <= d_date_max@1, required_guarantees=[d_date in (2000-06-30, 2000-09-27, 2000-11-17)] └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p8] t1:[p0..p8] │ RepartitionExec: partitioning=Hash([CAST(date_dim.d_date_sk AS Float64)@2], 9), input_partitions=3 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_date@1 as d_date, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 10 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([sr_returned_date_sk@0], 9), input_partitions=2 + │ RepartitionExec: partitioning=Hash([sr_returned_date_sk@0], 9), input_partitions=3 │ ProjectionExec: expr=[sr_returned_date_sk@1 as sr_returned_date_sk, sr_return_quantity@2 as sr_return_quantity, i_item_id@0 as i_item_id] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, sr_item_sk@1)], projection=[i_item_id@1, sr_returned_date_sk@2, sr_return_quantity@4] │ CoalescePartitionsExec │ [Stage 9] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], ...]}, projection=[sr_returned_date_sk, sr_item_sk, sr_return_quantity], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-3.parquet:..]]}, projection=[sr_returned_date_sk, sr_item_sk, sr_return_quantity], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 17 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([i_item_id@0], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([i_item_id@0], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[i_item_id@1 as i_item_id], aggr=[sum(catalog_returns.cr_return_quantity)] │ HashJoinExec: mode=CollectLeft, join_type=RightSemi, on=[(d_date@0, d_date@2)], projection=[cr_return_quantity@0, i_item_id@1] │ CoalescePartitionsExec @@ -9413,8 +8689,7 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, cr_item_sk@1)], projection=[i_item_id@1, cr_returned_date_sk@2, cr_return_quantity@4] │ CoalescePartitionsExec │ [Stage 16] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], ...]}, projection=[cr_returned_date_sk, cr_item_sk, cr_return_quantity], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-3.parquet:..]]}, projection=[cr_returned_date_sk, cr_item_sk, cr_return_quantity], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 14 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -9422,25 +8697,21 @@ mod tests { │ CoalescePartitionsExec │ [Stage 13] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=6, input_tasks=2 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date, d_week_seq], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date, d_week_seq], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 13 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=3, consumer_tasks=2, output_partitions=6 │ FilterExec: d_date@0 = 2000-06-30 OR d_date@0 = 2000-09-27 OR d_date@0 = 2000-11-17, projection=[d_week_seq@1] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date, d_week_seq], file_type=parquet, predicate=d_date@2 = 2000-06-30 OR d_date@2 = 2000-09-27 OR d_date@2 = 2000-11-17, pruning_predicate=d_date_null_count@2 != row_count@3 AND d_date_min@0 <= 2000-06-30 AND 2000-06-30 <= d_date_max@1 OR d_date_null_count@2 != row_count@3 AND d_date_min@0 <= 2000-09-27 AND 2000-09-27 <= d_date_max@1 OR d_date_null_count@2 != row_count@3 AND d_date_min@0 <= 2000-11-17 AND 2000-11-17 <= d_date_max@1, required_guarantees=[d_date in (2000-06-30, 2000-09-27, 2000-11-17)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date, d_week_seq], file_type=parquet, predicate=d_date@2 = 2000-06-30 OR d_date@2 = 2000-09-27 OR d_date@2 = 2000-11-17, pruning_predicate=d_date_null_count@2 != row_count@3 AND d_date_min@0 <= 2000-06-30 AND 2000-06-30 <= d_date_max@1 OR d_date_null_count@2 != row_count@3 AND d_date_min@0 <= 2000-09-27 AND 2000-09-27 <= d_date_max@1 OR d_date_null_count@2 != row_count@3 AND d_date_min@0 <= 2000-11-17 AND 2000-11-17 <= d_date_max@1, required_guarantees=[d_date in (2000-06-30, 2000-09-27, 2000-11-17)] └────────────────────────────────────────────────── ┌───── Stage 15 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 16 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id], file_type=parquet └────────────────────────────────────────────────── "); Ok(()) @@ -9448,23 +8719,22 @@ mod tests { #[tokio::test] async fn test_tpcds_84() -> Result<()> { let display = test_tpcds_query("q84").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ ProjectionExec: expr=[customer_id@0 as customer_id, customername@1 as customername] │ SortPreservingMergeExec: [c_customer_id@2 ASC], fetch=100 - │ [Stage 6] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 6] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 └────────────────────────────────────────────────── - ┌───── Stage 6 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 6 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ SortExec: TopK(fetch=100), expr=[customer_id@0 ASC], preserve_partitioning=[true] │ ProjectionExec: expr=[c_customer_id@0 as customer_id, concat(concat(CASE WHEN c_last_name@2 IS NOT NULL THEN c_last_name@2 ELSE END, , ), CASE WHEN c_first_name@1 IS NOT NULL THEN c_first_name@1 ELSE END) as customername, c_customer_id@0 as c_customer_id] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(customer_demographics.cd_demo_sk AS Float64)@4, sr_cdemo_sk@0)], projection=[c_customer_id@0, c_first_name@1, c_last_name@2] │ CoalescePartitionsExec - │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], ...]}, projection=[sr_cdemo_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-3.parquet:..]]}, projection=[sr_cdemo_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 5 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[c_customer_id@0 as c_customer_id, c_first_name@1 as c_first_name, c_last_name@2 as c_last_name, cd_demo_sk@3 as cd_demo_sk, CAST(cd_demo_sk@3 AS Float64) as CAST(customer_demographics.cd_demo_sk AS Float64)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ib_income_band_sk@0, hd_income_band_sk@4)], projection=[c_customer_id@1, c_first_name@2, c_last_name@3, cd_demo_sk@4] │ CoalescePartitionsExec @@ -9476,20 +8746,17 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(c_current_cdemo_sk@1, CAST(customer_demographics.cd_demo_sk AS Float64)@1)], projection=[c_customer_id@0, c_current_hdemo_sk@2, c_first_name@3, c_last_name@4, cd_demo_sk@5] │ CoalescePartitionsExec │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], ...]}, projection=[cd_demo_sk, CAST(cd_demo_sk@0 AS Float64) as CAST(customer_demographics.cd_demo_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-3.parquet:..]]}, projection=[cd_demo_sk, CAST(cd_demo_sk@0 AS Float64) as CAST(customer_demographics.cd_demo_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ FilterExec: ib_lower_bound@1 >= 38128 AND ib_upper_bound@2 <= 88128, projection=[ib_income_band_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/income_band/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/income_band/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/income_band/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/income_band/part-3.parquet]]}, projection=[ib_income_band_sk, ib_lower_bound, ib_upper_bound], file_type=parquet, predicate=ib_lower_bound@1 >= 38128 AND ib_upper_bound@2 <= 88128, pruning_predicate=ib_lower_bound_null_count@1 != row_count@2 AND ib_lower_bound_max@0 >= 38128 AND ib_upper_bound_null_count@4 != row_count@2 AND ib_upper_bound_min@3 <= 88128, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/income_band/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/income_band/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/income_band/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/income_band/part-3.parquet]]}, projection=[ib_income_band_sk, ib_lower_bound, ib_upper_bound], file_type=parquet, predicate=ib_lower_bound@1 >= 38128 AND ib_upper_bound@2 <= 88128, pruning_predicate=ib_lower_bound_null_count@1 != row_count@2 AND ib_lower_bound_max@0 >= 38128 AND ib_upper_bound_null_count@4 != row_count@2 AND ib_upper_bound_min@3 <= 88128, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_income_band_sk, CAST(hd_demo_sk@0 AS Float64) as CAST(household_demographics.hd_demo_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_income_band_sk, CAST(hd_demo_sk@0 AS Float64) as CAST(household_demographics.hd_demo_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -9497,15 +8764,13 @@ mod tests { │ CoalescePartitionsExec │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=6, input_tasks=2 │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_id, c_current_cdemo_sk, c_current_hdemo_sk, c_current_addr_sk, c_first_name, c_last_name], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_id, c_current_cdemo_sk, c_current_hdemo_sk, c_current_addr_sk, c_first_name, c_last_name], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=3, consumer_tasks=2, output_partitions=6 │ FilterExec: ca_city@1 = Edgewood, projection=[ca_address_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_city], file_type=parquet, predicate=ca_city@6 = Edgewood, pruning_predicate=ca_city_null_count@2 != row_count@3 AND ca_city_min@0 <= Edgewood AND Edgewood <= ca_city_max@1, required_guarantees=[ca_city in (Edgewood)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_city], file_type=parquet, predicate=ca_city@6 = Edgewood, pruning_predicate=ca_city_null_count@2 != row_count@3 AND ca_city_min@0 <= Edgewood AND Edgewood <= ca_city_max@1, required_guarantees=[ca_city in (Edgewood)] └────────────────────────────────────────────────── "); Ok(()) @@ -9513,7 +8778,7 @@ mod tests { #[tokio::test] async fn test_tpcds_85() -> Result<()> { let display = test_tpcds_query("q85").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [substr(reason.r_reason_desc,Int64(1),Int64(20))@0 ASC NULLS LAST, avg1@1 ASC NULLS LAST, avg2@2 ASC NULLS LAST, avg(web_returns.wr_fee)@3 ASC NULLS LAST], fetch=100 │ [Stage 10] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -9537,42 +8802,37 @@ mod tests { │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 2000, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2000 AND DynamicFilter [ empty ], pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1, required_guarantees=[d_year in (2000)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year], file_type=parquet, predicate=d_year@6 = 2000 AND DynamicFilter [ empty ], pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 2000 AND 2000 <= d_year_max@1, required_guarantees=[d_year in (2000)] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/reason/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/reason/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/reason/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/reason/part-3.parquet]]}, projection=[r_reason_sk, r_reason_desc, CAST(r_reason_sk@0 AS Float64) as CAST(reason.r_reason_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/reason/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/reason/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/reason/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/reason/part-3.parquet]]}, projection=[r_reason_sk, r_reason_desc, CAST(r_reason_sk@0 AS Float64) as CAST(reason.r_reason_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(wr_refunded_addr_sk@3, CAST(customer_address.ca_address_sk AS Float64)@2)], filter=(ca_state@1 = IN OR ca_state@1 = OH OR ca_state@1 = NJ) AND ws_net_profit@0 >= Some(10000),7,2 AND ws_net_profit@0 <= Some(20000),7,2 OR (ca_state@1 = WI OR ca_state@1 = CT OR ca_state@1 = KY) AND ws_net_profit@0 >= Some(15000),7,2 AND ws_net_profit@0 <= Some(30000),7,2 OR (ca_state@1 = LA OR ca_state@1 = IA OR ca_state@1 = AR) AND ws_net_profit@0 >= Some(5000),7,2 AND ws_net_profit@0 <= Some(25000),7,2, projection=[ws_sold_date_sk@0, ws_quantity@1, wr_reason_sk@4, wr_fee@5, wr_refunded_cash@6] │ CoalescePartitionsExec - │ [Stage 7] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 7] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[ca_address_sk@0 as ca_address_sk, ca_state@1 as ca_state, CAST(ca_address_sk@0 AS Float64) as CAST(customer_address.ca_address_sk AS Float64)] │ FilterExec: (ca_state@1 = IN OR ca_state@1 = OH OR ca_state@1 = NJ OR ca_state@1 = WI OR ca_state@1 = CT OR ca_state@1 = KY OR ca_state@1 = LA OR ca_state@1 = IA OR ca_state@1 = AR) AND ca_state@1 IN (SET) ([IN, OH, NJ, WI, CT, KY, LA, IA, AR]) AND ca_country@2 = United States, projection=[ca_address_sk@0, ca_state@1] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_state, ca_country], file_type=parquet, predicate=(ca_state@8 = IN OR ca_state@8 = OH OR ca_state@8 = NJ OR ca_state@8 = WI OR ca_state@8 = CT OR ca_state@8 = KY OR ca_state@8 = LA OR ca_state@8 = IA OR ca_state@8 = AR) AND ca_state@8 IN (SET) ([IN, OH, NJ, WI, CT, KY, LA, IA, AR]) AND ca_country@10 = United States AND DynamicFilter [ empty ], pruning_predicate=(ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= IN AND IN <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= OH AND OH <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= NJ AND NJ <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= WI AND WI <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= CT AND CT <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= KY AND KY <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= LA AND LA <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= IA AND IA <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= AR AND AR <= ca_state_max@1) AND (ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= IN AND IN <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= OH AND OH <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= NJ AND NJ <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= WI AND WI <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= CT AND CT <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= KY AND KY <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= LA AND LA <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= IA AND IA <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= AR AND AR <= ca_state_max@1) AND ca_country_null_count@6 != row_count@3 AND ca_country_min@4 <= United States AND United States <= ca_country_max@5, required_guarantees=[ca_country in (United States), ca_state in (AR, CT, IA, IN, KY, LA, NJ, OH, WI)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_state, ca_country], file_type=parquet, predicate=(ca_state@8 = IN OR ca_state@8 = OH OR ca_state@8 = NJ OR ca_state@8 = WI OR ca_state@8 = CT OR ca_state@8 = KY OR ca_state@8 = LA OR ca_state@8 = IA OR ca_state@8 = AR) AND ca_state@8 IN (SET) ([IN, OH, NJ, WI, CT, KY, LA, IA, AR]) AND ca_country@10 = United States AND DynamicFilter [ empty ], pruning_predicate=(ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= IN AND IN <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= OH AND OH <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= NJ AND NJ <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= WI AND WI <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= CT AND CT <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= KY AND KY <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= LA AND LA <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= IA AND IA <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= AR AND AR <= ca_state_max@1) AND (ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= IN AND IN <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= OH AND OH <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= NJ AND NJ <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= WI AND WI <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= CT AND CT <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= KY AND KY <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= LA AND LA <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= IA AND IA <= ca_state_max@1 OR ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= AR AND AR <= ca_state_max@1) AND ca_country_null_count@6 != row_count@3 AND ca_country_min@4 <= United States AND United States <= ca_country_max@5, required_guarantees=[ca_country in (United States), ca_state in (AR, CT, IA, IN, KY, LA, NJ, OH, WI)] └────────────────────────────────────────────────── - ┌───── Stage 7 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 7 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(wr_returning_cdemo_sk@4, CAST(cd2.cd_demo_sk AS Float64)@3), (cd_marital_status@8, cd_marital_status@1), (cd_education_status@9, cd_education_status@2)], projection=[ws_sold_date_sk@0, ws_quantity@1, ws_net_profit@2, wr_refunded_addr_sk@3, wr_reason_sk@5, wr_fee@6, wr_refunded_cash@7] │ CoalescePartitionsExec - │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], ...]}, projection=[cd_demo_sk, cd_marital_status, cd_education_status, CAST(cd_demo_sk@0 AS Float64) as CAST(cd2.cd_demo_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] + │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-3.parquet:..]]}, projection=[cd_demo_sk, cd_marital_status, cd_education_status, CAST(cd_demo_sk@0 AS Float64) as CAST(cd2.cd_demo_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 6 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 6 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(wr_refunded_cdemo_sk@4, CAST(cd1.cd_demo_sk AS Float64)@3)], filter=cd_marital_status@1 = M AND cd_education_status@2 = Advanced Degree AND ws_sales_price@0 >= Some(10000),5,2 AND ws_sales_price@0 <= Some(15000),5,2 OR cd_marital_status@1 = S AND cd_education_status@2 = College AND ws_sales_price@0 >= Some(5000),5,2 AND ws_sales_price@0 <= Some(10000),5,2 OR cd_marital_status@1 = W AND cd_education_status@2 = 2 yr Degree AND ws_sales_price@0 >= Some(15000),5,2 AND ws_sales_price@0 <= Some(20000),5,2, projection=[ws_sold_date_sk@0, ws_quantity@1, ws_net_profit@3, wr_refunded_addr_sk@5, wr_returning_cdemo_sk@6, wr_reason_sk@7, wr_fee@8, wr_refunded_cash@9, cd_marital_status@11, cd_education_status@12] │ CoalescePartitionsExec │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 │ ProjectionExec: expr=[cd_demo_sk@0 as cd_demo_sk, cd_marital_status@1 as cd_marital_status, cd_education_status@2 as cd_education_status, CAST(cd_demo_sk@0 AS Float64) as CAST(cd1.cd_demo_sk AS Float64)] │ FilterExec: cd_marital_status@1 = M AND cd_education_status@2 = Advanced Degree OR cd_marital_status@1 = S AND cd_education_status@2 = College OR cd_marital_status@1 = W AND cd_education_status@2 = 2 yr Degree - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], ...]}, projection=[cd_demo_sk, cd_marital_status, cd_education_status], file_type=parquet, predicate=(cd_marital_status@2 = M AND cd_education_status@3 = Advanced Degree OR cd_marital_status@2 = S AND cd_education_status@3 = College OR cd_marital_status@2 = W AND cd_education_status@3 = 2 yr Degree) AND DynamicFilter [ empty ], pruning_predicate=cd_marital_status_null_count@2 != row_count@3 AND cd_marital_status_min@0 <= M AND M <= cd_marital_status_max@1 AND cd_education_status_null_count@6 != row_count@3 AND cd_education_status_min@4 <= Advanced Degree AND Advanced Degree <= cd_education_status_max@5 OR cd_marital_status_null_count@2 != row_count@3 AND cd_marital_status_min@0 <= S AND S <= cd_marital_status_max@1 AND cd_education_status_null_count@6 != row_count@3 AND cd_education_status_min@4 <= College AND College <= cd_education_status_max@5 OR cd_marital_status_null_count@2 != row_count@3 AND cd_marital_status_min@0 <= W AND W <= cd_marital_status_max@1 AND cd_education_status_null_count@6 != row_count@3 AND cd_education_status_min@4 <= 2 yr Degree AND 2 yr Degree <= cd_education_status_max@5, required_guarantees=[cd_education_status in (2 yr Degree, Advanced Degree, College), cd_marital_status in (M, S, W)] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-3.parquet:..]]}, projection=[cd_demo_sk, cd_marital_status, cd_education_status], file_type=parquet, predicate=(cd_marital_status@2 = M AND cd_education_status@3 = Advanced Degree OR cd_marital_status@2 = S AND cd_education_status@3 = College OR cd_marital_status@2 = W AND cd_education_status@3 = 2 yr Degree) AND DynamicFilter [ empty ], pruning_predicate=cd_marital_status_null_count@2 != row_count@3 AND cd_marital_status_min@0 <= M AND M <= cd_marital_status_max@1 AND cd_education_status_null_count@6 != row_count@3 AND cd_education_status_min@4 <= Advanced Degree AND Advanced Degree <= cd_education_status_max@5 OR cd_marital_status_null_count@2 != row_count@3 AND cd_marital_status_min@0 <= S AND S <= cd_marital_status_max@1 AND cd_education_status_null_count@6 != row_count@3 AND cd_education_status_min@4 <= College AND College <= cd_education_status_max@5 OR cd_marital_status_null_count@2 != row_count@3 AND cd_marital_status_min@0 <= W AND W <= cd_marital_status_max@1 AND cd_education_status_null_count@6 != row_count@3 AND cd_education_status_min@4 <= 2 yr Degree AND 2 yr Degree <= cd_education_status_max@5, required_guarantees=[cd_education_status in (2 yr Degree, Advanced Degree, College), cd_marital_status in (M, S, W)] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -9586,19 +8846,16 @@ mod tests { └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p3] t1:[p4..p7] │ BroadcastExec: input_partitions=2, consumer_tasks=2, output_partitions=4 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_page/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_page/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_page/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_page/part-3.parquet]]}, projection=[wp_web_page_sk, CAST(wp_web_page_sk@0 AS Float64) as CAST(web_page.wp_web_page_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_page/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/web_page/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_page/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/web_page/part-3.parquet]]}, projection=[wp_web_page_sk, CAST(wp_web_page_sk@0 AS Float64) as CAST(web_page.wp_web_page_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p0..p5] │ RepartitionExec: partitioning=Hash([wr_item_sk@0, wr_order_number@5], 6), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_returns/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_returns/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_returns/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_returns/part-3.parquet]]}, projection=[wr_item_sk, wr_refunded_cdemo_sk, wr_refunded_addr_sk, wr_returning_cdemo_sk, wr_reason_sk, wr_order_number, wr_fee, wr_refunded_cash], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_returns/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/web_returns/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_returns/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/web_returns/part-3.parquet]]}, projection=[wr_item_sk, wr_refunded_cdemo_sk, wr_refunded_addr_sk, wr_returning_cdemo_sk, wr_reason_sk, wr_order_number, wr_fee, wr_refunded_cash], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([ws_item_sk@1, ws_order_number@3], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([ws_item_sk@1, ws_order_number@3], 6), input_partitions=3 │ FilterExec: (ws_net_profit@6 >= Some(10000),7,2 AND ws_net_profit@6 <= Some(20000),7,2 OR ws_net_profit@6 >= Some(15000),7,2 AND ws_net_profit@6 <= Some(30000),7,2 OR ws_net_profit@6 >= Some(5000),7,2 AND ws_net_profit@6 <= Some(25000),7,2) AND (ws_sales_price@5 >= Some(10000),5,2 AND ws_sales_price@5 <= Some(15000),5,2 OR ws_sales_price@5 >= Some(5000),5,2 AND ws_sales_price@5 <= Some(10000),5,2 OR ws_sales_price@5 >= Some(15000),5,2 AND ws_sales_price@5 <= Some(20000),5,2) - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_item_sk, ws_web_page_sk, ws_order_number, ws_quantity, ws_sales_price, ws_net_profit], file_type=parquet, predicate=(ws_net_profit@33 >= Some(10000),7,2 AND ws_net_profit@33 <= Some(20000),7,2 OR ws_net_profit@33 >= Some(15000),7,2 AND ws_net_profit@33 <= Some(30000),7,2 OR ws_net_profit@33 >= Some(5000),7,2 AND ws_net_profit@33 <= Some(25000),7,2) AND (ws_sales_price@21 >= Some(10000),5,2 AND ws_sales_price@21 <= Some(15000),5,2 OR ws_sales_price@21 >= Some(5000),5,2 AND ws_sales_price@21 <= Some(10000),5,2 OR ws_sales_price@21 >= Some(15000),5,2 AND ws_sales_price@21 <= Some(20000),5,2) AND DynamicFilter [ empty ] AND DynamicFilter [ empty ], pruning_predicate=(ws_net_profit_null_count@1 != row_count@2 AND ws_net_profit_max@0 >= Some(10000),7,2 AND ws_net_profit_null_count@1 != row_count@2 AND ws_net_profit_min@3 <= Some(20000),7,2 OR ws_net_profit_null_count@1 != row_count@2 AND ws_net_profit_max@0 >= Some(15000),7,2 AND ws_net_profit_null_count@1 != row_count@2 AND ws_net_profit_min@3 <= Some(30000),7,2 OR ws_net_profit_null_count@1 != row_count@2 AND ws_net_profit_max@0 >= Some(5000),7,2 AND ws_net_profit_null_count@1 != row_count@2 AND ws_net_profit_min@3 <= Some(25000),7,2) AND (ws_sales_price_null_count@5 != row_count@2 AND ws_sales_price_max@4 >= Some(10000),5,2 AND ws_sales_price_null_count@5 != row_count@2 AND ws_sales_price_min@6 <= Some(15000),5,2 OR ws_sales_price_null_count@5 != row_count@2 AND ws_sales_price_max@4 >= Some(5000),5,2 AND ws_sales_price_null_count@5 != row_count@2 AND ws_sales_price_min@6 <= Some(10000),5,2 OR ws_sales_price_null_count@5 != row_count@2 AND ws_sales_price_max@4 >= Some(15000),5,2 AND ws_sales_price_null_count@5 != row_count@2 AND ws_sales_price_min@6 <= Some(20000),5,2), required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_item_sk, ws_web_page_sk, ws_order_number, ws_quantity, ws_sales_price, ws_net_profit], file_type=parquet, predicate=(ws_net_profit@33 >= Some(10000),7,2 AND ws_net_profit@33 <= Some(20000),7,2 OR ws_net_profit@33 >= Some(15000),7,2 AND ws_net_profit@33 <= Some(30000),7,2 OR ws_net_profit@33 >= Some(5000),7,2 AND ws_net_profit@33 <= Some(25000),7,2) AND (ws_sales_price@21 >= Some(10000),5,2 AND ws_sales_price@21 <= Some(15000),5,2 OR ws_sales_price@21 >= Some(5000),5,2 AND ws_sales_price@21 <= Some(10000),5,2 OR ws_sales_price@21 >= Some(15000),5,2 AND ws_sales_price@21 <= Some(20000),5,2) AND DynamicFilter [ empty ] AND DynamicFilter [ empty ], pruning_predicate=(ws_net_profit_null_count@1 != row_count@2 AND ws_net_profit_max@0 >= Some(10000),7,2 AND ws_net_profit_null_count@1 != row_count@2 AND ws_net_profit_min@3 <= Some(20000),7,2 OR ws_net_profit_null_count@1 != row_count@2 AND ws_net_profit_max@0 >= Some(15000),7,2 AND ws_net_profit_null_count@1 != row_count@2 AND ws_net_profit_min@3 <= Some(30000),7,2 OR ws_net_profit_null_count@1 != row_count@2 AND ws_net_profit_max@0 >= Some(5000),7,2 AND ws_net_profit_null_count@1 != row_count@2 AND ws_net_profit_min@3 <= Some(25000),7,2) AND (ws_sales_price_null_count@5 != row_count@2 AND ws_sales_price_max@4 >= Some(10000),5,2 AND ws_sales_price_null_count@5 != row_count@2 AND ws_sales_price_min@6 <= Some(15000),5,2 OR ws_sales_price_null_count@5 != row_count@2 AND ws_sales_price_max@4 >= Some(5000),5,2 AND ws_sales_price_null_count@5 != row_count@2 AND ws_sales_price_min@6 <= Some(10000),5,2 OR ws_sales_price_null_count@5 != row_count@2 AND ws_sales_price_max@4 >= Some(15000),5,2 AND ws_sales_price_null_count@5 != row_count@2 AND ws_sales_price_min@6 <= Some(20000),5,2), required_guarantees=[] └────────────────────────────────────────────────── "); Ok(()) @@ -9654,7 +8911,7 @@ mod tests { #[tokio::test] async fn test_tpcds_87() -> Result<()> { let display = test_tpcds_query("q87").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ ProjectionExec: expr=[count(Int64(1))@0 as count(*)] │ AggregateExec: mode=Final, gby=[], aggr=[count(Int64(1))] @@ -9685,28 +8942,25 @@ mod tests { │ ProjectionExec: expr=[c_last_name@2 as c_last_name, c_first_name@1 as c_first_name, d_date@0 as d_date] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ws_bill_customer_sk@0, CAST(customer.c_customer_sk AS Float64)@3)], projection=[d_date@1, c_first_name@3, c_last_name@4] │ CoalescePartitionsExec - │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_first_name@1 as c_first_name, c_last_name@2 as c_last_name, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_first_name, c_last_name], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_first_name, c_last_name], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[ws_bill_customer_sk@1 as ws_bill_customer_sk, d_date@0 as d_date] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@2, ws_sold_date_sk@0)], projection=[d_date@1, ws_bill_customer_sk@4] │ CoalescePartitionsExec │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_bill_customer_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_bill_customer_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_date@1 as d_date, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_month_seq@2 >= 1200 AND d_month_seq@2 <= 1211, projection=[d_date_sk@0, d_date@1] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date, d_month_seq], file_type=parquet, predicate=d_month_seq@3 >= 1200 AND d_month_seq@3 <= 1211, pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1200 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1211, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date, d_month_seq], file_type=parquet, predicate=d_month_seq@3 >= 1200 AND d_month_seq@3 <= 1211, pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1200 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1211, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=3, consumer_tasks=2, output_partitions=6 @@ -9719,28 +8973,25 @@ mod tests { │ ProjectionExec: expr=[c_last_name@2 as c_last_name, c_first_name@1 as c_first_name, d_date@0 as d_date] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(cs_bill_customer_sk@0, CAST(customer.c_customer_sk AS Float64)@3)], projection=[d_date@1, c_first_name@3, c_last_name@4] │ CoalescePartitionsExec - │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_first_name@1 as c_first_name, c_last_name@2 as c_last_name, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_first_name, c_last_name], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_first_name, c_last_name], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 6 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 6 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[cs_bill_customer_sk@1 as cs_bill_customer_sk, d_date@0 as d_date] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@2, cs_sold_date_sk@0)], projection=[d_date@1, cs_bill_customer_sk@4] │ CoalescePartitionsExec │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_bill_customer_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_bill_customer_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_date@1 as d_date, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_month_seq@2 >= 1200 AND d_month_seq@2 <= 1211, projection=[d_date_sk@0, d_date@1] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date, d_month_seq], file_type=parquet, predicate=d_month_seq@3 >= 1200 AND d_month_seq@3 <= 1211, pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1200 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1211, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date, d_month_seq], file_type=parquet, predicate=d_month_seq@3 >= 1200 AND d_month_seq@3 <= 1211, pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1200 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1211, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 11 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] │ RepartitionExec: partitioning=Hash([c_last_name@0, c_first_name@1, d_date@2], 6), input_partitions=3 @@ -9748,28 +8999,25 @@ mod tests { │ ProjectionExec: expr=[c_last_name@2 as c_last_name, c_first_name@1 as c_first_name, d_date@0 as d_date] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ss_customer_sk@0, CAST(customer.c_customer_sk AS Float64)@3)], projection=[d_date@1, c_first_name@3, c_last_name@4] │ CoalescePartitionsExec - │ [Stage 10] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 10] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_first_name@1 as c_first_name, c_last_name@2 as c_last_name, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_first_name, c_last_name], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_first_name, c_last_name], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 10 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 10 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[ss_customer_sk@1 as ss_customer_sk, d_date@0 as d_date] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@2, ss_sold_date_sk@0)], projection=[d_date@1, ss_customer_sk@4] │ CoalescePartitionsExec │ [Stage 9] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_customer_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_customer_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_date@1 as d_date, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_month_seq@2 >= 1200 AND d_month_seq@2 <= 1211, projection=[d_date_sk@0, d_date@1] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date, d_month_seq], file_type=parquet, predicate=d_month_seq@3 >= 1200 AND d_month_seq@3 <= 1211, pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1200 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1211, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date, d_month_seq], file_type=parquet, predicate=d_month_seq@3 >= 1200 AND d_month_seq@3 <= 1211, pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1200 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1211, required_guarantees=[] └────────────────────────────────────────────────── "); Ok(()) @@ -9777,55 +9025,55 @@ mod tests { #[tokio::test] async fn test_tpcds_88() -> Result<()> { let display = test_tpcds_query("q88").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ ProjectionExec: expr=[h8_30_to_9@1 as h8_30_to_9, h9_to_9_30@2 as h9_to_9_30, h9_30_to_10@3 as h9_30_to_10, h10_to_10_30@4 as h10_to_10_30, h10_30_to_11@5 as h10_30_to_11, h11_to_11_30@6 as h11_to_11_30, h11_30_to_12@7 as h11_30_to_12, h12_to_12_30@0 as h12_to_12_30] │ CrossJoinExec │ ProjectionExec: expr=[count(Int64(1))@0 as h12_to_12_30] │ AggregateExec: mode=Final, gby=[], aggr=[count(Int64(1))] │ CoalescePartitionsExec - │ [Stage 4] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 4] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ ProjectionExec: expr=[h8_30_to_9@1 as h8_30_to_9, h9_to_9_30@2 as h9_to_9_30, h9_30_to_10@3 as h9_30_to_10, h10_to_10_30@4 as h10_to_10_30, h10_30_to_11@5 as h10_30_to_11, h11_to_11_30@6 as h11_to_11_30, h11_30_to_12@0 as h11_30_to_12] │ CrossJoinExec │ ProjectionExec: expr=[count(Int64(1))@0 as h11_30_to_12] │ AggregateExec: mode=Final, gby=[], aggr=[count(Int64(1))] │ CoalescePartitionsExec - │ [Stage 8] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 8] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ ProjectionExec: expr=[h8_30_to_9@1 as h8_30_to_9, h9_to_9_30@2 as h9_to_9_30, h9_30_to_10@3 as h9_30_to_10, h10_to_10_30@4 as h10_to_10_30, h10_30_to_11@5 as h10_30_to_11, h11_to_11_30@0 as h11_to_11_30] │ CrossJoinExec │ ProjectionExec: expr=[count(Int64(1))@0 as h11_to_11_30] │ AggregateExec: mode=Final, gby=[], aggr=[count(Int64(1))] │ CoalescePartitionsExec - │ [Stage 12] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 12] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ ProjectionExec: expr=[h8_30_to_9@1 as h8_30_to_9, h9_to_9_30@2 as h9_to_9_30, h9_30_to_10@3 as h9_30_to_10, h10_to_10_30@4 as h10_to_10_30, h10_30_to_11@0 as h10_30_to_11] │ CrossJoinExec │ ProjectionExec: expr=[count(Int64(1))@0 as h10_30_to_11] │ AggregateExec: mode=Final, gby=[], aggr=[count(Int64(1))] │ CoalescePartitionsExec - │ [Stage 16] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 16] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ ProjectionExec: expr=[h8_30_to_9@1 as h8_30_to_9, h9_to_9_30@2 as h9_to_9_30, h9_30_to_10@3 as h9_30_to_10, h10_to_10_30@0 as h10_to_10_30] │ CrossJoinExec │ ProjectionExec: expr=[count(Int64(1))@0 as h10_to_10_30] │ AggregateExec: mode=Final, gby=[], aggr=[count(Int64(1))] │ CoalescePartitionsExec - │ [Stage 20] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 20] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ ProjectionExec: expr=[h8_30_to_9@1 as h8_30_to_9, h9_to_9_30@2 as h9_to_9_30, h9_30_to_10@0 as h9_30_to_10] │ CrossJoinExec │ ProjectionExec: expr=[count(Int64(1))@0 as h9_30_to_10] │ AggregateExec: mode=Final, gby=[], aggr=[count(Int64(1))] │ CoalescePartitionsExec - │ [Stage 24] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 24] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ CrossJoinExec │ ProjectionExec: expr=[count(Int64(1))@0 as h8_30_to_9] │ AggregateExec: mode=Final, gby=[], aggr=[count(Int64(1))] │ CoalescePartitionsExec - │ [Stage 28] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 28] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ ProjectionExec: expr=[count(Int64(1))@0 as h9_to_9_30] │ AggregateExec: mode=Final, gby=[], aggr=[count(Int64(1))] │ CoalescePartitionsExec - │ [Stage 32] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 32] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 └────────────────────────────────────────────────── - ┌───── Stage 4 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 4 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[count(Int64(1))] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(store.s_store_sk AS Float64)@1, ss_store_sk@0)], projection=[] │ CoalescePartitionsExec @@ -9836,34 +9084,30 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(household_demographics.hd_demo_sk AS Float64)@1, ss_hdemo_sk@1)], projection=[ss_sold_time_sk@2, ss_store_sk@4] │ CoalescePartitionsExec │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_time_sk, ss_hdemo_sk, ss_store_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_time_sk, ss_hdemo_sk, ss_store_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[s_store_sk@0 as s_store_sk, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)] │ FilterExec: s_store_name@1 = ese, projection=[s_store_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name], file_type=parquet, predicate=s_store_name@5 = ese, pruning_predicate=s_store_name_null_count@2 != row_count@3 AND s_store_name_min@0 <= ese AND ese <= s_store_name_max@1, required_guarantees=[s_store_name in (ese)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name], file_type=parquet, predicate=s_store_name@5 = ese, pruning_predicate=s_store_name_null_count@2 != row_count@3 AND s_store_name_min@0 <= ese AND ese <= s_store_name_max@1, required_guarantees=[s_store_name in (ese)] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[t_time_sk@0 as t_time_sk, CAST(t_time_sk@0 AS Float64) as CAST(time_dim.t_time_sk AS Float64)] │ FilterExec: t_hour@1 = 12 AND t_minute@2 < 30, projection=[t_time_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/time_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-3.parquet]]}, projection=[t_time_sk, t_hour, t_minute], file_type=parquet, predicate=t_hour@3 = 12 AND t_minute@4 < 30, pruning_predicate=t_hour_null_count@2 != row_count@3 AND t_hour_min@0 <= 12 AND 12 <= t_hour_max@1 AND t_minute_null_count@5 != row_count@3 AND t_minute_min@4 < 30, required_guarantees=[t_hour in (12)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/time_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/time_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/time_dim/part-3.parquet]]}, projection=[t_time_sk, t_hour, t_minute], file_type=parquet, predicate=t_hour@3 = 12 AND t_minute@4 < 30, pruning_predicate=t_hour_null_count@2 != row_count@3 AND t_hour_min@0 <= 12 AND 12 <= t_hour_max@1 AND t_minute_null_count@5 != row_count@3 AND t_minute_min@4 < 30, required_guarantees=[t_hour in (12)] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[hd_demo_sk@0 as hd_demo_sk, CAST(hd_demo_sk@0 AS Float64) as CAST(household_demographics.hd_demo_sk AS Float64)] │ FilterExec: hd_dep_count@1 = 4 AND hd_vehicle_count@2 <= 6 OR hd_dep_count@1 = 2 AND hd_vehicle_count@2 <= 4 OR hd_dep_count@1 = 0 AND hd_vehicle_count@2 <= 2, projection=[hd_demo_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_dep_count, hd_vehicle_count], file_type=parquet, predicate=hd_dep_count@3 = 4 AND hd_vehicle_count@4 <= 6 OR hd_dep_count@3 = 2 AND hd_vehicle_count@4 <= 4 OR hd_dep_count@3 = 0 AND hd_vehicle_count@4 <= 2, pruning_predicate=hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 4 AND 4 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 6 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 2 AND 2 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 4 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 0 AND 0 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 2, required_guarantees=[hd_dep_count in (0, 2, 4)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_dep_count, hd_vehicle_count], file_type=parquet, predicate=hd_dep_count@3 = 4 AND hd_vehicle_count@4 <= 6 OR hd_dep_count@3 = 2 AND hd_vehicle_count@4 <= 4 OR hd_dep_count@3 = 0 AND hd_vehicle_count@4 <= 2, pruning_predicate=hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 4 AND 4 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 6 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 2 AND 2 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 4 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 0 AND 0 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 2, required_guarantees=[hd_dep_count in (0, 2, 4)] └────────────────────────────────────────────────── - ┌───── Stage 8 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 8 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[count(Int64(1))] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(store.s_store_sk AS Float64)@1, ss_store_sk@0)], projection=[] │ CoalescePartitionsExec @@ -9874,34 +9118,30 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(household_demographics.hd_demo_sk AS Float64)@1, ss_hdemo_sk@1)], projection=[ss_sold_time_sk@2, ss_store_sk@4] │ CoalescePartitionsExec │ [Stage 7] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_time_sk, ss_hdemo_sk, ss_store_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_time_sk, ss_hdemo_sk, ss_store_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[s_store_sk@0 as s_store_sk, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)] │ FilterExec: s_store_name@1 = ese, projection=[s_store_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name], file_type=parquet, predicate=s_store_name@5 = ese, pruning_predicate=s_store_name_null_count@2 != row_count@3 AND s_store_name_min@0 <= ese AND ese <= s_store_name_max@1, required_guarantees=[s_store_name in (ese)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name], file_type=parquet, predicate=s_store_name@5 = ese, pruning_predicate=s_store_name_null_count@2 != row_count@3 AND s_store_name_min@0 <= ese AND ese <= s_store_name_max@1, required_guarantees=[s_store_name in (ese)] └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[t_time_sk@0 as t_time_sk, CAST(t_time_sk@0 AS Float64) as CAST(time_dim.t_time_sk AS Float64)] │ FilterExec: t_hour@1 = 11 AND t_minute@2 >= 30, projection=[t_time_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/time_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-3.parquet]]}, projection=[t_time_sk, t_hour, t_minute], file_type=parquet, predicate=t_hour@3 = 11 AND t_minute@4 >= 30, pruning_predicate=t_hour_null_count@2 != row_count@3 AND t_hour_min@0 <= 11 AND 11 <= t_hour_max@1 AND t_minute_null_count@5 != row_count@3 AND t_minute_max@4 >= 30, required_guarantees=[t_hour in (11)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/time_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/time_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/time_dim/part-3.parquet]]}, projection=[t_time_sk, t_hour, t_minute], file_type=parquet, predicate=t_hour@3 = 11 AND t_minute@4 >= 30, pruning_predicate=t_hour_null_count@2 != row_count@3 AND t_hour_min@0 <= 11 AND 11 <= t_hour_max@1 AND t_minute_null_count@5 != row_count@3 AND t_minute_max@4 >= 30, required_guarantees=[t_hour in (11)] └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[hd_demo_sk@0 as hd_demo_sk, CAST(hd_demo_sk@0 AS Float64) as CAST(household_demographics.hd_demo_sk AS Float64)] │ FilterExec: hd_dep_count@1 = 4 AND hd_vehicle_count@2 <= 6 OR hd_dep_count@1 = 2 AND hd_vehicle_count@2 <= 4 OR hd_dep_count@1 = 0 AND hd_vehicle_count@2 <= 2, projection=[hd_demo_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_dep_count, hd_vehicle_count], file_type=parquet, predicate=hd_dep_count@3 = 4 AND hd_vehicle_count@4 <= 6 OR hd_dep_count@3 = 2 AND hd_vehicle_count@4 <= 4 OR hd_dep_count@3 = 0 AND hd_vehicle_count@4 <= 2, pruning_predicate=hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 4 AND 4 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 6 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 2 AND 2 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 4 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 0 AND 0 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 2, required_guarantees=[hd_dep_count in (0, 2, 4)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_dep_count, hd_vehicle_count], file_type=parquet, predicate=hd_dep_count@3 = 4 AND hd_vehicle_count@4 <= 6 OR hd_dep_count@3 = 2 AND hd_vehicle_count@4 <= 4 OR hd_dep_count@3 = 0 AND hd_vehicle_count@4 <= 2, pruning_predicate=hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 4 AND 4 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 6 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 2 AND 2 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 4 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 0 AND 0 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 2, required_guarantees=[hd_dep_count in (0, 2, 4)] └────────────────────────────────────────────────── - ┌───── Stage 12 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 12 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[count(Int64(1))] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(store.s_store_sk AS Float64)@1, ss_store_sk@0)], projection=[] │ CoalescePartitionsExec @@ -9912,34 +9152,30 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(household_demographics.hd_demo_sk AS Float64)@1, ss_hdemo_sk@1)], projection=[ss_sold_time_sk@2, ss_store_sk@4] │ CoalescePartitionsExec │ [Stage 11] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_time_sk, ss_hdemo_sk, ss_store_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_time_sk, ss_hdemo_sk, ss_store_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[s_store_sk@0 as s_store_sk, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)] │ FilterExec: s_store_name@1 = ese, projection=[s_store_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name], file_type=parquet, predicate=s_store_name@5 = ese, pruning_predicate=s_store_name_null_count@2 != row_count@3 AND s_store_name_min@0 <= ese AND ese <= s_store_name_max@1, required_guarantees=[s_store_name in (ese)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name], file_type=parquet, predicate=s_store_name@5 = ese, pruning_predicate=s_store_name_null_count@2 != row_count@3 AND s_store_name_min@0 <= ese AND ese <= s_store_name_max@1, required_guarantees=[s_store_name in (ese)] └────────────────────────────────────────────────── ┌───── Stage 10 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[t_time_sk@0 as t_time_sk, CAST(t_time_sk@0 AS Float64) as CAST(time_dim.t_time_sk AS Float64)] │ FilterExec: t_hour@1 = 11 AND t_minute@2 < 30, projection=[t_time_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/time_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-3.parquet]]}, projection=[t_time_sk, t_hour, t_minute], file_type=parquet, predicate=t_hour@3 = 11 AND t_minute@4 < 30, pruning_predicate=t_hour_null_count@2 != row_count@3 AND t_hour_min@0 <= 11 AND 11 <= t_hour_max@1 AND t_minute_null_count@5 != row_count@3 AND t_minute_min@4 < 30, required_guarantees=[t_hour in (11)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/time_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/time_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/time_dim/part-3.parquet]]}, projection=[t_time_sk, t_hour, t_minute], file_type=parquet, predicate=t_hour@3 = 11 AND t_minute@4 < 30, pruning_predicate=t_hour_null_count@2 != row_count@3 AND t_hour_min@0 <= 11 AND 11 <= t_hour_max@1 AND t_minute_null_count@5 != row_count@3 AND t_minute_min@4 < 30, required_guarantees=[t_hour in (11)] └────────────────────────────────────────────────── ┌───── Stage 11 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[hd_demo_sk@0 as hd_demo_sk, CAST(hd_demo_sk@0 AS Float64) as CAST(household_demographics.hd_demo_sk AS Float64)] │ FilterExec: hd_dep_count@1 = 4 AND hd_vehicle_count@2 <= 6 OR hd_dep_count@1 = 2 AND hd_vehicle_count@2 <= 4 OR hd_dep_count@1 = 0 AND hd_vehicle_count@2 <= 2, projection=[hd_demo_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_dep_count, hd_vehicle_count], file_type=parquet, predicate=hd_dep_count@3 = 4 AND hd_vehicle_count@4 <= 6 OR hd_dep_count@3 = 2 AND hd_vehicle_count@4 <= 4 OR hd_dep_count@3 = 0 AND hd_vehicle_count@4 <= 2, pruning_predicate=hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 4 AND 4 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 6 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 2 AND 2 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 4 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 0 AND 0 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 2, required_guarantees=[hd_dep_count in (0, 2, 4)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_dep_count, hd_vehicle_count], file_type=parquet, predicate=hd_dep_count@3 = 4 AND hd_vehicle_count@4 <= 6 OR hd_dep_count@3 = 2 AND hd_vehicle_count@4 <= 4 OR hd_dep_count@3 = 0 AND hd_vehicle_count@4 <= 2, pruning_predicate=hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 4 AND 4 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 6 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 2 AND 2 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 4 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 0 AND 0 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 2, required_guarantees=[hd_dep_count in (0, 2, 4)] └────────────────────────────────────────────────── - ┌───── Stage 16 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 16 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[count(Int64(1))] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(store.s_store_sk AS Float64)@1, ss_store_sk@0)], projection=[] │ CoalescePartitionsExec @@ -9950,34 +9186,30 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(household_demographics.hd_demo_sk AS Float64)@1, ss_hdemo_sk@1)], projection=[ss_sold_time_sk@2, ss_store_sk@4] │ CoalescePartitionsExec │ [Stage 15] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_time_sk, ss_hdemo_sk, ss_store_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_time_sk, ss_hdemo_sk, ss_store_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 13 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[s_store_sk@0 as s_store_sk, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)] │ FilterExec: s_store_name@1 = ese, projection=[s_store_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name], file_type=parquet, predicate=s_store_name@5 = ese, pruning_predicate=s_store_name_null_count@2 != row_count@3 AND s_store_name_min@0 <= ese AND ese <= s_store_name_max@1, required_guarantees=[s_store_name in (ese)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name], file_type=parquet, predicate=s_store_name@5 = ese, pruning_predicate=s_store_name_null_count@2 != row_count@3 AND s_store_name_min@0 <= ese AND ese <= s_store_name_max@1, required_guarantees=[s_store_name in (ese)] └────────────────────────────────────────────────── ┌───── Stage 14 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[t_time_sk@0 as t_time_sk, CAST(t_time_sk@0 AS Float64) as CAST(time_dim.t_time_sk AS Float64)] │ FilterExec: t_hour@1 = 10 AND t_minute@2 >= 30, projection=[t_time_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/time_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-3.parquet]]}, projection=[t_time_sk, t_hour, t_minute], file_type=parquet, predicate=t_hour@3 = 10 AND t_minute@4 >= 30, pruning_predicate=t_hour_null_count@2 != row_count@3 AND t_hour_min@0 <= 10 AND 10 <= t_hour_max@1 AND t_minute_null_count@5 != row_count@3 AND t_minute_max@4 >= 30, required_guarantees=[t_hour in (10)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/time_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/time_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/time_dim/part-3.parquet]]}, projection=[t_time_sk, t_hour, t_minute], file_type=parquet, predicate=t_hour@3 = 10 AND t_minute@4 >= 30, pruning_predicate=t_hour_null_count@2 != row_count@3 AND t_hour_min@0 <= 10 AND 10 <= t_hour_max@1 AND t_minute_null_count@5 != row_count@3 AND t_minute_max@4 >= 30, required_guarantees=[t_hour in (10)] └────────────────────────────────────────────────── ┌───── Stage 15 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[hd_demo_sk@0 as hd_demo_sk, CAST(hd_demo_sk@0 AS Float64) as CAST(household_demographics.hd_demo_sk AS Float64)] │ FilterExec: hd_dep_count@1 = 4 AND hd_vehicle_count@2 <= 6 OR hd_dep_count@1 = 2 AND hd_vehicle_count@2 <= 4 OR hd_dep_count@1 = 0 AND hd_vehicle_count@2 <= 2, projection=[hd_demo_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_dep_count, hd_vehicle_count], file_type=parquet, predicate=hd_dep_count@3 = 4 AND hd_vehicle_count@4 <= 6 OR hd_dep_count@3 = 2 AND hd_vehicle_count@4 <= 4 OR hd_dep_count@3 = 0 AND hd_vehicle_count@4 <= 2, pruning_predicate=hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 4 AND 4 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 6 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 2 AND 2 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 4 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 0 AND 0 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 2, required_guarantees=[hd_dep_count in (0, 2, 4)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_dep_count, hd_vehicle_count], file_type=parquet, predicate=hd_dep_count@3 = 4 AND hd_vehicle_count@4 <= 6 OR hd_dep_count@3 = 2 AND hd_vehicle_count@4 <= 4 OR hd_dep_count@3 = 0 AND hd_vehicle_count@4 <= 2, pruning_predicate=hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 4 AND 4 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 6 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 2 AND 2 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 4 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 0 AND 0 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 2, required_guarantees=[hd_dep_count in (0, 2, 4)] └────────────────────────────────────────────────── - ┌───── Stage 20 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 20 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[count(Int64(1))] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(store.s_store_sk AS Float64)@1, ss_store_sk@0)], projection=[] │ CoalescePartitionsExec @@ -9988,34 +9220,30 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(household_demographics.hd_demo_sk AS Float64)@1, ss_hdemo_sk@1)], projection=[ss_sold_time_sk@2, ss_store_sk@4] │ CoalescePartitionsExec │ [Stage 19] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_time_sk, ss_hdemo_sk, ss_store_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_time_sk, ss_hdemo_sk, ss_store_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 17 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[s_store_sk@0 as s_store_sk, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)] │ FilterExec: s_store_name@1 = ese, projection=[s_store_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name], file_type=parquet, predicate=s_store_name@5 = ese, pruning_predicate=s_store_name_null_count@2 != row_count@3 AND s_store_name_min@0 <= ese AND ese <= s_store_name_max@1, required_guarantees=[s_store_name in (ese)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name], file_type=parquet, predicate=s_store_name@5 = ese, pruning_predicate=s_store_name_null_count@2 != row_count@3 AND s_store_name_min@0 <= ese AND ese <= s_store_name_max@1, required_guarantees=[s_store_name in (ese)] └────────────────────────────────────────────────── ┌───── Stage 18 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[t_time_sk@0 as t_time_sk, CAST(t_time_sk@0 AS Float64) as CAST(time_dim.t_time_sk AS Float64)] │ FilterExec: t_hour@1 = 10 AND t_minute@2 < 30, projection=[t_time_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/time_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-3.parquet]]}, projection=[t_time_sk, t_hour, t_minute], file_type=parquet, predicate=t_hour@3 = 10 AND t_minute@4 < 30, pruning_predicate=t_hour_null_count@2 != row_count@3 AND t_hour_min@0 <= 10 AND 10 <= t_hour_max@1 AND t_minute_null_count@5 != row_count@3 AND t_minute_min@4 < 30, required_guarantees=[t_hour in (10)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/time_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/time_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/time_dim/part-3.parquet]]}, projection=[t_time_sk, t_hour, t_minute], file_type=parquet, predicate=t_hour@3 = 10 AND t_minute@4 < 30, pruning_predicate=t_hour_null_count@2 != row_count@3 AND t_hour_min@0 <= 10 AND 10 <= t_hour_max@1 AND t_minute_null_count@5 != row_count@3 AND t_minute_min@4 < 30, required_guarantees=[t_hour in (10)] └────────────────────────────────────────────────── ┌───── Stage 19 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[hd_demo_sk@0 as hd_demo_sk, CAST(hd_demo_sk@0 AS Float64) as CAST(household_demographics.hd_demo_sk AS Float64)] │ FilterExec: hd_dep_count@1 = 4 AND hd_vehicle_count@2 <= 6 OR hd_dep_count@1 = 2 AND hd_vehicle_count@2 <= 4 OR hd_dep_count@1 = 0 AND hd_vehicle_count@2 <= 2, projection=[hd_demo_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_dep_count, hd_vehicle_count], file_type=parquet, predicate=hd_dep_count@3 = 4 AND hd_vehicle_count@4 <= 6 OR hd_dep_count@3 = 2 AND hd_vehicle_count@4 <= 4 OR hd_dep_count@3 = 0 AND hd_vehicle_count@4 <= 2, pruning_predicate=hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 4 AND 4 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 6 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 2 AND 2 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 4 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 0 AND 0 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 2, required_guarantees=[hd_dep_count in (0, 2, 4)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_dep_count, hd_vehicle_count], file_type=parquet, predicate=hd_dep_count@3 = 4 AND hd_vehicle_count@4 <= 6 OR hd_dep_count@3 = 2 AND hd_vehicle_count@4 <= 4 OR hd_dep_count@3 = 0 AND hd_vehicle_count@4 <= 2, pruning_predicate=hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 4 AND 4 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 6 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 2 AND 2 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 4 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 0 AND 0 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 2, required_guarantees=[hd_dep_count in (0, 2, 4)] └────────────────────────────────────────────────── - ┌───── Stage 24 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 24 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[count(Int64(1))] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(store.s_store_sk AS Float64)@1, ss_store_sk@0)], projection=[] │ CoalescePartitionsExec @@ -10026,34 +9254,30 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(household_demographics.hd_demo_sk AS Float64)@1, ss_hdemo_sk@1)], projection=[ss_sold_time_sk@2, ss_store_sk@4] │ CoalescePartitionsExec │ [Stage 23] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_time_sk, ss_hdemo_sk, ss_store_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_time_sk, ss_hdemo_sk, ss_store_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 21 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[s_store_sk@0 as s_store_sk, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)] │ FilterExec: s_store_name@1 = ese, projection=[s_store_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name], file_type=parquet, predicate=s_store_name@5 = ese, pruning_predicate=s_store_name_null_count@2 != row_count@3 AND s_store_name_min@0 <= ese AND ese <= s_store_name_max@1, required_guarantees=[s_store_name in (ese)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name], file_type=parquet, predicate=s_store_name@5 = ese, pruning_predicate=s_store_name_null_count@2 != row_count@3 AND s_store_name_min@0 <= ese AND ese <= s_store_name_max@1, required_guarantees=[s_store_name in (ese)] └────────────────────────────────────────────────── ┌───── Stage 22 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[t_time_sk@0 as t_time_sk, CAST(t_time_sk@0 AS Float64) as CAST(time_dim.t_time_sk AS Float64)] │ FilterExec: t_hour@1 = 9 AND t_minute@2 >= 30, projection=[t_time_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/time_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-3.parquet]]}, projection=[t_time_sk, t_hour, t_minute], file_type=parquet, predicate=t_hour@3 = 9 AND t_minute@4 >= 30, pruning_predicate=t_hour_null_count@2 != row_count@3 AND t_hour_min@0 <= 9 AND 9 <= t_hour_max@1 AND t_minute_null_count@5 != row_count@3 AND t_minute_max@4 >= 30, required_guarantees=[t_hour in (9)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/time_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/time_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/time_dim/part-3.parquet]]}, projection=[t_time_sk, t_hour, t_minute], file_type=parquet, predicate=t_hour@3 = 9 AND t_minute@4 >= 30, pruning_predicate=t_hour_null_count@2 != row_count@3 AND t_hour_min@0 <= 9 AND 9 <= t_hour_max@1 AND t_minute_null_count@5 != row_count@3 AND t_minute_max@4 >= 30, required_guarantees=[t_hour in (9)] └────────────────────────────────────────────────── ┌───── Stage 23 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[hd_demo_sk@0 as hd_demo_sk, CAST(hd_demo_sk@0 AS Float64) as CAST(household_demographics.hd_demo_sk AS Float64)] │ FilterExec: hd_dep_count@1 = 4 AND hd_vehicle_count@2 <= 6 OR hd_dep_count@1 = 2 AND hd_vehicle_count@2 <= 4 OR hd_dep_count@1 = 0 AND hd_vehicle_count@2 <= 2, projection=[hd_demo_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_dep_count, hd_vehicle_count], file_type=parquet, predicate=hd_dep_count@3 = 4 AND hd_vehicle_count@4 <= 6 OR hd_dep_count@3 = 2 AND hd_vehicle_count@4 <= 4 OR hd_dep_count@3 = 0 AND hd_vehicle_count@4 <= 2, pruning_predicate=hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 4 AND 4 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 6 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 2 AND 2 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 4 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 0 AND 0 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 2, required_guarantees=[hd_dep_count in (0, 2, 4)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_dep_count, hd_vehicle_count], file_type=parquet, predicate=hd_dep_count@3 = 4 AND hd_vehicle_count@4 <= 6 OR hd_dep_count@3 = 2 AND hd_vehicle_count@4 <= 4 OR hd_dep_count@3 = 0 AND hd_vehicle_count@4 <= 2, pruning_predicate=hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 4 AND 4 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 6 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 2 AND 2 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 4 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 0 AND 0 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 2, required_guarantees=[hd_dep_count in (0, 2, 4)] └────────────────────────────────────────────────── - ┌───── Stage 28 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 28 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[count(Int64(1))] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(store.s_store_sk AS Float64)@1, ss_store_sk@0)], projection=[] │ CoalescePartitionsExec @@ -10064,34 +9288,30 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(household_demographics.hd_demo_sk AS Float64)@1, ss_hdemo_sk@1)], projection=[ss_sold_time_sk@2, ss_store_sk@4] │ CoalescePartitionsExec │ [Stage 27] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_time_sk, ss_hdemo_sk, ss_store_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_time_sk, ss_hdemo_sk, ss_store_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 25 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[s_store_sk@0 as s_store_sk, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)] │ FilterExec: s_store_name@1 = ese, projection=[s_store_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name], file_type=parquet, predicate=s_store_name@5 = ese, pruning_predicate=s_store_name_null_count@2 != row_count@3 AND s_store_name_min@0 <= ese AND ese <= s_store_name_max@1, required_guarantees=[s_store_name in (ese)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name], file_type=parquet, predicate=s_store_name@5 = ese, pruning_predicate=s_store_name_null_count@2 != row_count@3 AND s_store_name_min@0 <= ese AND ese <= s_store_name_max@1, required_guarantees=[s_store_name in (ese)] └────────────────────────────────────────────────── ┌───── Stage 26 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[t_time_sk@0 as t_time_sk, CAST(t_time_sk@0 AS Float64) as CAST(time_dim.t_time_sk AS Float64)] │ FilterExec: t_hour@1 = 8 AND t_minute@2 >= 30, projection=[t_time_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/time_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-3.parquet]]}, projection=[t_time_sk, t_hour, t_minute], file_type=parquet, predicate=t_hour@3 = 8 AND t_minute@4 >= 30, pruning_predicate=t_hour_null_count@2 != row_count@3 AND t_hour_min@0 <= 8 AND 8 <= t_hour_max@1 AND t_minute_null_count@5 != row_count@3 AND t_minute_max@4 >= 30, required_guarantees=[t_hour in (8)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/time_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/time_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/time_dim/part-3.parquet]]}, projection=[t_time_sk, t_hour, t_minute], file_type=parquet, predicate=t_hour@3 = 8 AND t_minute@4 >= 30, pruning_predicate=t_hour_null_count@2 != row_count@3 AND t_hour_min@0 <= 8 AND 8 <= t_hour_max@1 AND t_minute_null_count@5 != row_count@3 AND t_minute_max@4 >= 30, required_guarantees=[t_hour in (8)] └────────────────────────────────────────────────── ┌───── Stage 27 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[hd_demo_sk@0 as hd_demo_sk, CAST(hd_demo_sk@0 AS Float64) as CAST(household_demographics.hd_demo_sk AS Float64)] │ FilterExec: hd_dep_count@1 = 4 AND hd_vehicle_count@2 <= 6 OR hd_dep_count@1 = 2 AND hd_vehicle_count@2 <= 4 OR hd_dep_count@1 = 0 AND hd_vehicle_count@2 <= 2, projection=[hd_demo_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_dep_count, hd_vehicle_count], file_type=parquet, predicate=hd_dep_count@3 = 4 AND hd_vehicle_count@4 <= 6 OR hd_dep_count@3 = 2 AND hd_vehicle_count@4 <= 4 OR hd_dep_count@3 = 0 AND hd_vehicle_count@4 <= 2, pruning_predicate=hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 4 AND 4 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 6 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 2 AND 2 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 4 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 0 AND 0 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 2, required_guarantees=[hd_dep_count in (0, 2, 4)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_dep_count, hd_vehicle_count], file_type=parquet, predicate=hd_dep_count@3 = 4 AND hd_vehicle_count@4 <= 6 OR hd_dep_count@3 = 2 AND hd_vehicle_count@4 <= 4 OR hd_dep_count@3 = 0 AND hd_vehicle_count@4 <= 2, pruning_predicate=hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 4 AND 4 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 6 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 2 AND 2 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 4 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 0 AND 0 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 2, required_guarantees=[hd_dep_count in (0, 2, 4)] └────────────────────────────────────────────────── - ┌───── Stage 32 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 32 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[count(Int64(1))] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(store.s_store_sk AS Float64)@1, ss_store_sk@0)], projection=[] │ CoalescePartitionsExec @@ -10102,32 +9322,28 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(household_demographics.hd_demo_sk AS Float64)@1, ss_hdemo_sk@1)], projection=[ss_sold_time_sk@2, ss_store_sk@4] │ CoalescePartitionsExec │ [Stage 31] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_time_sk, ss_hdemo_sk, ss_store_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_time_sk, ss_hdemo_sk, ss_store_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 29 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[s_store_sk@0 as s_store_sk, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)] │ FilterExec: s_store_name@1 = ese, projection=[s_store_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name], file_type=parquet, predicate=s_store_name@5 = ese, pruning_predicate=s_store_name_null_count@2 != row_count@3 AND s_store_name_min@0 <= ese AND ese <= s_store_name_max@1, required_guarantees=[s_store_name in (ese)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name], file_type=parquet, predicate=s_store_name@5 = ese, pruning_predicate=s_store_name_null_count@2 != row_count@3 AND s_store_name_min@0 <= ese AND ese <= s_store_name_max@1, required_guarantees=[s_store_name in (ese)] └────────────────────────────────────────────────── ┌───── Stage 30 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[t_time_sk@0 as t_time_sk, CAST(t_time_sk@0 AS Float64) as CAST(time_dim.t_time_sk AS Float64)] │ FilterExec: t_hour@1 = 9 AND t_minute@2 < 30, projection=[t_time_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/time_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-3.parquet]]}, projection=[t_time_sk, t_hour, t_minute], file_type=parquet, predicate=t_hour@3 = 9 AND t_minute@4 < 30, pruning_predicate=t_hour_null_count@2 != row_count@3 AND t_hour_min@0 <= 9 AND 9 <= t_hour_max@1 AND t_minute_null_count@5 != row_count@3 AND t_minute_min@4 < 30, required_guarantees=[t_hour in (9)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/time_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/time_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/time_dim/part-3.parquet]]}, projection=[t_time_sk, t_hour, t_minute], file_type=parquet, predicate=t_hour@3 = 9 AND t_minute@4 < 30, pruning_predicate=t_hour_null_count@2 != row_count@3 AND t_hour_min@0 <= 9 AND 9 <= t_hour_max@1 AND t_minute_null_count@5 != row_count@3 AND t_minute_min@4 < 30, required_guarantees=[t_hour in (9)] └────────────────────────────────────────────────── ┌───── Stage 31 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[hd_demo_sk@0 as hd_demo_sk, CAST(hd_demo_sk@0 AS Float64) as CAST(household_demographics.hd_demo_sk AS Float64)] │ FilterExec: hd_dep_count@1 = 4 AND hd_vehicle_count@2 <= 6 OR hd_dep_count@1 = 2 AND hd_vehicle_count@2 <= 4 OR hd_dep_count@1 = 0 AND hd_vehicle_count@2 <= 2, projection=[hd_demo_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_dep_count, hd_vehicle_count], file_type=parquet, predicate=hd_dep_count@3 = 4 AND hd_vehicle_count@4 <= 6 OR hd_dep_count@3 = 2 AND hd_vehicle_count@4 <= 4 OR hd_dep_count@3 = 0 AND hd_vehicle_count@4 <= 2, pruning_predicate=hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 4 AND 4 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 6 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 2 AND 2 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 4 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 0 AND 0 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 2, required_guarantees=[hd_dep_count in (0, 2, 4)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_dep_count, hd_vehicle_count], file_type=parquet, predicate=hd_dep_count@3 = 4 AND hd_vehicle_count@4 <= 6 OR hd_dep_count@3 = 2 AND hd_vehicle_count@4 <= 4 OR hd_dep_count@3 = 0 AND hd_vehicle_count@4 <= 2, pruning_predicate=hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 4 AND 4 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 6 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 2 AND 2 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 4 OR hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 0 AND 0 <= hd_dep_count_max@1 AND hd_vehicle_count_null_count@5 != row_count@3 AND hd_vehicle_count_min@4 <= 2, required_guarantees=[hd_dep_count in (0, 2, 4)] └────────────────────────────────────────────────── "); Ok(()) @@ -10151,7 +9367,7 @@ mod tests { │ [Stage 4] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([i_category@0, i_class@1, i_brand@2, s_store_name@3, s_company_name@4, d_moy@5], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([i_category@0, i_class@1, i_brand@2, s_store_name@3, s_company_name@4, d_moy@5], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[i_category@2 as i_category, i_class@1 as i_class, i_brand@0 as i_brand, s_store_name@5 as s_store_name, s_company_name@6 as s_company_name, d_moy@4 as d_moy], aggr=[sum(store_sales.ss_sales_price)] │ ProjectionExec: expr=[i_brand@2 as i_brand, i_class@3 as i_class, i_category@4 as i_category, ss_sales_price@5 as ss_sales_price, d_moy@6 as d_moy, s_store_name@0 as s_store_name, s_company_name@1 as s_company_name] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(store.s_store_sk AS Float64)@3, ss_store_sk@3)], projection=[s_store_name@1, s_company_name@2, i_brand@4, i_class@5, i_category@6, ss_sales_price@8, d_moy@9] @@ -10164,28 +9380,24 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, ss_item_sk@1)], projection=[i_brand@1, i_class@2, i_category@3, ss_sold_date_sk@4, ss_store_sk@6, ss_sales_price@7] │ CoalescePartitionsExec │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_store_sk, ss_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_store_sk, ss_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name, s_company_name, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name, s_company_name, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, d_moy@1 as d_moy, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_year@1 = 1999, projection=[d_date_sk@0, d_moy@2] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1999, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1999 AND 1999 <= d_year_max@1, required_guarantees=[d_year in (1999)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1999, pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1999 AND 1999 <= d_year_max@1, required_guarantees=[d_year in (1999)] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ FilterExec: (i_category@3 = Books OR i_category@3 = Electronics OR i_category@3 = Sports) AND (i_class@2 = computers OR i_class@2 = stereo OR i_class@2 = football) OR (i_category@3 = Men OR i_category@3 = Jewelry OR i_category@3 = Women) AND (i_class@2 = shirts OR i_class@2 = birdal OR i_class@2 = dresses) │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand, i_class, i_category], file_type=parquet, predicate=(i_category@12 = Books OR i_category@12 = Electronics OR i_category@12 = Sports) AND (i_class@10 = computers OR i_class@10 = stereo OR i_class@10 = football) OR (i_category@12 = Men OR i_category@12 = Jewelry OR i_category@12 = Women) AND (i_class@10 = shirts OR i_class@10 = birdal OR i_class@10 = dresses), pruning_predicate=(i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Books AND Books <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Electronics AND Electronics <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Sports AND Sports <= i_category_max@1) AND (i_class_null_count@6 != row_count@3 AND i_class_min@4 <= computers AND computers <= i_class_max@5 OR i_class_null_count@6 != row_count@3 AND i_class_min@4 <= stereo AND stereo <= i_class_max@5 OR i_class_null_count@6 != row_count@3 AND i_class_min@4 <= football AND football <= i_class_max@5) OR (i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Men AND Men <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Jewelry AND Jewelry <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Women AND Women <= i_category_max@1) AND (i_class_null_count@6 != row_count@3 AND i_class_min@4 <= shirts AND shirts <= i_class_max@5 OR i_class_null_count@6 != row_count@3 AND i_class_min@4 <= birdal AND birdal <= i_class_max@5 OR i_class_null_count@6 != row_count@3 AND i_class_min@4 <= dresses AND dresses <= i_class_max@5), required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_brand, i_class, i_category], file_type=parquet, predicate=(i_category@12 = Books OR i_category@12 = Electronics OR i_category@12 = Sports) AND (i_class@10 = computers OR i_class@10 = stereo OR i_class@10 = football) OR (i_category@12 = Men OR i_category@12 = Jewelry OR i_category@12 = Women) AND (i_class@10 = shirts OR i_class@10 = birdal OR i_class@10 = dresses), pruning_predicate=(i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Books AND Books <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Electronics AND Electronics <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Sports AND Sports <= i_category_max@1) AND (i_class_null_count@6 != row_count@3 AND i_class_min@4 <= computers AND computers <= i_class_max@5 OR i_class_null_count@6 != row_count@3 AND i_class_min@4 <= stereo AND stereo <= i_class_max@5 OR i_class_null_count@6 != row_count@3 AND i_class_min@4 <= football AND football <= i_class_max@5) OR (i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Men AND Men <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Jewelry AND Jewelry <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Women AND Women <= i_category_max@1) AND (i_class_null_count@6 != row_count@3 AND i_class_min@4 <= shirts AND shirts <= i_class_max@5 OR i_class_null_count@6 != row_count@3 AND i_class_min@4 <= birdal AND birdal <= i_class_max@5 OR i_class_null_count@6 != row_count@3 AND i_class_min@4 <= dresses AND dresses <= i_class_max@5), required_guarantees=[] └────────────────────────────────────────────────── "#); Ok(()) @@ -10193,7 +9405,7 @@ mod tests { #[tokio::test] async fn test_tpcds_90() -> Result<()> { let display = test_tpcds_query("q90").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortExec: TopK(fetch=100), expr=[am_pm_ratio@0 ASC NULLS LAST], preserve_partitioning=[false] │ ProjectionExec: expr=[CASE WHEN pmc@1 = 0 THEN None,23,8 ELSE CAST(amc@0 AS Decimal128(15, 4)) / CAST(pmc@1 AS Decimal128(15, 4)) END as am_pm_ratio] @@ -10201,13 +9413,13 @@ mod tests { │ ProjectionExec: expr=[count(Int64(1))@0 as amc] │ AggregateExec: mode=Final, gby=[], aggr=[count(Int64(1))] │ CoalescePartitionsExec - │ [Stage 4] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 4] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ ProjectionExec: expr=[count(Int64(1))@0 as pmc] │ AggregateExec: mode=Final, gby=[], aggr=[count(Int64(1))] │ CoalescePartitionsExec - │ [Stage 8] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 8] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 └────────────────────────────────────────────────── - ┌───── Stage 4 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 4 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[count(Int64(1))] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(web_page.wp_web_page_sk AS Float64)@1, ws_web_page_sk@0)], projection=[] │ CoalescePartitionsExec @@ -10218,34 +9430,30 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(household_demographics.hd_demo_sk AS Float64)@1, ws_ship_hdemo_sk@1)], projection=[ws_sold_time_sk@2, ws_web_page_sk@4] │ CoalescePartitionsExec │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_time_sk, ws_ship_hdemo_sk, ws_web_page_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_time_sk, ws_ship_hdemo_sk, ws_web_page_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[wp_web_page_sk@0 as wp_web_page_sk, CAST(wp_web_page_sk@0 AS Float64) as CAST(web_page.wp_web_page_sk AS Float64)] │ FilterExec: wp_char_count@1 >= 5000 AND wp_char_count@1 <= 5200, projection=[wp_web_page_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_page/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_page/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_page/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_page/part-3.parquet]]}, projection=[wp_web_page_sk, wp_char_count], file_type=parquet, predicate=wp_char_count@10 >= 5000 AND wp_char_count@10 <= 5200, pruning_predicate=wp_char_count_null_count@1 != row_count@2 AND wp_char_count_max@0 >= 5000 AND wp_char_count_null_count@1 != row_count@2 AND wp_char_count_min@3 <= 5200, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_page/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/web_page/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_page/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/web_page/part-3.parquet]]}, projection=[wp_web_page_sk, wp_char_count], file_type=parquet, predicate=wp_char_count@10 >= 5000 AND wp_char_count@10 <= 5200, pruning_predicate=wp_char_count_null_count@1 != row_count@2 AND wp_char_count_max@0 >= 5000 AND wp_char_count_null_count@1 != row_count@2 AND wp_char_count_min@3 <= 5200, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[t_time_sk@0 as t_time_sk, CAST(t_time_sk@0 AS Float64) as CAST(time_dim.t_time_sk AS Float64)] │ FilterExec: t_hour@1 >= 8 AND t_hour@1 <= 9, projection=[t_time_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/time_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-3.parquet]]}, projection=[t_time_sk, t_hour], file_type=parquet, predicate=t_hour@3 >= 8 AND t_hour@3 <= 9, pruning_predicate=t_hour_null_count@1 != row_count@2 AND t_hour_max@0 >= 8 AND t_hour_null_count@1 != row_count@2 AND t_hour_min@3 <= 9, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/time_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/time_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/time_dim/part-3.parquet]]}, projection=[t_time_sk, t_hour], file_type=parquet, predicate=t_hour@3 >= 8 AND t_hour@3 <= 9, pruning_predicate=t_hour_null_count@1 != row_count@2 AND t_hour_max@0 >= 8 AND t_hour_null_count@1 != row_count@2 AND t_hour_min@3 <= 9, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[hd_demo_sk@0 as hd_demo_sk, CAST(hd_demo_sk@0 AS Float64) as CAST(household_demographics.hd_demo_sk AS Float64)] │ FilterExec: hd_dep_count@1 = 6, projection=[hd_demo_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_dep_count], file_type=parquet, predicate=hd_dep_count@3 = 6, pruning_predicate=hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 6 AND 6 <= hd_dep_count_max@1, required_guarantees=[hd_dep_count in (6)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_dep_count], file_type=parquet, predicate=hd_dep_count@3 = 6, pruning_predicate=hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 6 AND 6 <= hd_dep_count_max@1, required_guarantees=[hd_dep_count in (6)] └────────────────────────────────────────────────── - ┌───── Stage 8 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 8 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[count(Int64(1))] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(web_page.wp_web_page_sk AS Float64)@1, ws_web_page_sk@0)], projection=[] │ CoalescePartitionsExec @@ -10256,32 +9464,28 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(household_demographics.hd_demo_sk AS Float64)@1, ws_ship_hdemo_sk@1)], projection=[ws_sold_time_sk@2, ws_web_page_sk@4] │ CoalescePartitionsExec │ [Stage 7] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_time_sk, ws_ship_hdemo_sk, ws_web_page_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_time_sk, ws_ship_hdemo_sk, ws_web_page_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[wp_web_page_sk@0 as wp_web_page_sk, CAST(wp_web_page_sk@0 AS Float64) as CAST(web_page.wp_web_page_sk AS Float64)] │ FilterExec: wp_char_count@1 >= 5000 AND wp_char_count@1 <= 5200, projection=[wp_web_page_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_page/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_page/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_page/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_page/part-3.parquet]]}, projection=[wp_web_page_sk, wp_char_count], file_type=parquet, predicate=wp_char_count@10 >= 5000 AND wp_char_count@10 <= 5200, pruning_predicate=wp_char_count_null_count@1 != row_count@2 AND wp_char_count_max@0 >= 5000 AND wp_char_count_null_count@1 != row_count@2 AND wp_char_count_min@3 <= 5200, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_page/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/web_page/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_page/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/web_page/part-3.parquet]]}, projection=[wp_web_page_sk, wp_char_count], file_type=parquet, predicate=wp_char_count@10 >= 5000 AND wp_char_count@10 <= 5200, pruning_predicate=wp_char_count_null_count@1 != row_count@2 AND wp_char_count_max@0 >= 5000 AND wp_char_count_null_count@1 != row_count@2 AND wp_char_count_min@3 <= 5200, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[t_time_sk@0 as t_time_sk, CAST(t_time_sk@0 AS Float64) as CAST(time_dim.t_time_sk AS Float64)] │ FilterExec: t_hour@1 >= 19 AND t_hour@1 <= 20, projection=[t_time_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/time_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-3.parquet]]}, projection=[t_time_sk, t_hour], file_type=parquet, predicate=t_hour@3 >= 19 AND t_hour@3 <= 20, pruning_predicate=t_hour_null_count@1 != row_count@2 AND t_hour_max@0 >= 19 AND t_hour_null_count@1 != row_count@2 AND t_hour_min@3 <= 20, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/time_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/time_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/time_dim/part-3.parquet]]}, projection=[t_time_sk, t_hour], file_type=parquet, predicate=t_hour@3 >= 19 AND t_hour@3 <= 20, pruning_predicate=t_hour_null_count@1 != row_count@2 AND t_hour_max@0 >= 19 AND t_hour_null_count@1 != row_count@2 AND t_hour_min@3 <= 20, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[hd_demo_sk@0 as hd_demo_sk, CAST(hd_demo_sk@0 AS Float64) as CAST(household_demographics.hd_demo_sk AS Float64)] │ FilterExec: hd_dep_count@1 = 6, projection=[hd_demo_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_dep_count], file_type=parquet, predicate=hd_dep_count@3 = 6, pruning_predicate=hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 6 AND 6 <= hd_dep_count_max@1, required_guarantees=[hd_dep_count in (6)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_dep_count], file_type=parquet, predicate=hd_dep_count@3 = 6, pruning_predicate=hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 6 AND 6 <= hd_dep_count_max@1, required_guarantees=[hd_dep_count in (6)] └────────────────────────────────────────────────── "); Ok(()) @@ -10289,7 +9493,7 @@ mod tests { #[tokio::test] async fn test_tpcds_91() -> Result<()> { let display = test_tpcds_query("q91").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [returns_loss@3 DESC] │ [Stage 8] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -10305,22 +9509,20 @@ mod tests { │ AggregateExec: mode=Partial, gby=[cc_call_center_id@0 as cc_call_center_id, cc_name@1 as cc_name, cc_manager@2 as cc_manager, cd_marital_status@4 as cd_marital_status, cd_education_status@5 as cd_education_status], aggr=[sum(catalog_returns.cr_net_loss)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(c_current_hdemo_sk@4, CAST(household_demographics.hd_demo_sk AS Float64)@1)], projection=[cc_call_center_id@0, cc_name@1, cc_manager@2, cr_net_loss@3, cd_marital_status@5, cd_education_status@6] │ CoalescePartitionsExec - │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[hd_demo_sk@0 as hd_demo_sk, CAST(hd_demo_sk@0 AS Float64) as CAST(household_demographics.hd_demo_sk AS Float64)] │ FilterExec: hd_buy_potential@1 LIKE Unknown%, projection=[hd_demo_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_buy_potential], file_type=parquet, predicate=hd_buy_potential@2 LIKE Unknown% AND DynamicFilter [ empty ], pruning_predicate=hd_buy_potential_null_count@2 != row_count@3 AND hd_buy_potential_min@0 <= Unknowo AND Unknown <= hd_buy_potential_max@1, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_buy_potential], file_type=parquet, predicate=hd_buy_potential@2 LIKE Unknown% AND DynamicFilter [ empty ], pruning_predicate=hd_buy_potential_null_count@2 != row_count@3 AND hd_buy_potential_min@0 <= Unknowo AND Unknown <= hd_buy_potential_max@1, required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 6 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 6 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(c_current_cdemo_sk@4, CAST(customer_demographics.cd_demo_sk AS Float64)@3)], projection=[cc_call_center_id@0, cc_name@1, cc_manager@2, cr_net_loss@3, c_current_hdemo_sk@5, cd_marital_status@7, cd_education_status@8] │ CoalescePartitionsExec │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[cd_demo_sk@0 as cd_demo_sk, cd_marital_status@1 as cd_marital_status, cd_education_status@2 as cd_education_status, CAST(cd_demo_sk@0 AS Float64) as CAST(customer_demographics.cd_demo_sk AS Float64)] │ FilterExec: cd_marital_status@1 = M AND cd_education_status@2 = Unknown OR cd_marital_status@1 = W AND cd_education_status@2 = Advanced Degree - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], ...]}, projection=[cd_demo_sk, cd_marital_status, cd_education_status], file_type=parquet, predicate=(cd_marital_status@2 = M AND cd_education_status@3 = Unknown OR cd_marital_status@2 = W AND cd_education_status@3 = Advanced Degree) AND DynamicFilter [ empty ], pruning_predicate=cd_marital_status_null_count@2 != row_count@3 AND cd_marital_status_min@0 <= M AND M <= cd_marital_status_max@1 AND cd_education_status_null_count@6 != row_count@3 AND cd_education_status_min@4 <= Unknown AND Unknown <= cd_education_status_max@5 OR cd_marital_status_null_count@2 != row_count@3 AND cd_marital_status_min@0 <= W AND W <= cd_marital_status_max@1 AND cd_education_status_null_count@6 != row_count@3 AND cd_education_status_min@4 <= Advanced Degree AND Advanced Degree <= cd_education_status_max@5, required_guarantees=[cd_education_status in (Advanced Degree, Unknown), cd_marital_status in (M, W)] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/customer_demographics/part-3.parquet:..]]}, projection=[cd_demo_sk, cd_marital_status, cd_education_status], file_type=parquet, predicate=(cd_marital_status@2 = M AND cd_education_status@3 = Unknown OR cd_marital_status@2 = W AND cd_education_status@3 = Advanced Degree) AND DynamicFilter [ empty ], pruning_predicate=cd_marital_status_null_count@2 != row_count@3 AND cd_marital_status_min@0 <= M AND M <= cd_marital_status_max@1 AND cd_education_status_null_count@6 != row_count@3 AND cd_education_status_min@4 <= Unknown AND Unknown <= cd_education_status_max@5 OR cd_marital_status_null_count@2 != row_count@3 AND cd_marital_status_min@0 <= W AND W <= cd_marital_status_max@1 AND cd_education_status_null_count@6 != row_count@3 AND cd_education_status_min@4 <= Advanced Degree AND Advanced Degree <= cd_education_status_max@5, required_guarantees=[cd_education_status in (Advanced Degree, Unknown), cd_marital_status in (M, W)] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -10329,8 +9531,7 @@ mod tests { │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ FilterExec: ca_gmt_offset@1 = Some(-700),4,2, projection=[ca_address_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_gmt_offset], file_type=parquet, predicate=ca_gmt_offset@11 = Some(-700),4,2 AND DynamicFilter [ empty ], pruning_predicate=ca_gmt_offset_null_count@2 != row_count@3 AND ca_gmt_offset_min@0 <= Some(-700),4,2 AND Some(-700),4,2 <= ca_gmt_offset_max@1, required_guarantees=[ca_gmt_offset in (Some(-700),4,2)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_gmt_offset], file_type=parquet, predicate=ca_gmt_offset@11 = Some(-700),4,2 AND DynamicFilter [ empty ], pruning_predicate=ca_gmt_offset_null_count@2 != row_count@3 AND ca_gmt_offset_min@0 <= Some(-700),4,2 AND Some(-700),4,2 <= ca_gmt_offset_max@1, required_guarantees=[ca_gmt_offset in (Some(-700),4,2)] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 @@ -10339,31 +9540,27 @@ mod tests { │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[c_customer_sk@0 as c_customer_sk, c_current_cdemo_sk@1 as c_current_cdemo_sk, c_current_hdemo_sk@2 as c_current_hdemo_sk, c_current_addr_sk@3 as c_current_addr_sk, CAST(c_customer_sk@0 AS Float64) as CAST(customer.c_customer_sk AS Float64)] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_cdemo_sk, c_current_hdemo_sk, c_current_addr_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer/part-3.parquet]]}, projection=[c_customer_sk, c_current_cdemo_sk, c_current_hdemo_sk, c_current_addr_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(cr_returned_date_sk@3, d_date_sk@0)], projection=[cc_call_center_id@0, cc_name@1, cc_manager@2, cr_returning_customer_sk@4, cr_net_loss@5] │ CoalescePartitionsExec - │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ FilterExec: d_year@1 = 1998 AND d_moy@2 = 11, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1998 AND d_moy@8 = 11 AND DynamicFilter [ empty ], pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 11 AND 11 <= d_moy_max@5, required_guarantees=[d_moy in (11), d_year in (1998)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_year, d_moy], file_type=parquet, predicate=d_year@6 = 1998 AND d_moy@8 = 11 AND DynamicFilter [ empty ], pruning_predicate=d_year_null_count@2 != row_count@3 AND d_year_min@0 <= 1998 AND 1998 <= d_year_max@1 AND d_moy_null_count@6 != row_count@3 AND d_moy_min@4 <= 11 AND 11 <= d_moy_max@5, required_guarantees=[d_moy in (11), d_year in (1998)] └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(call_center.cc_call_center_sk AS Float64)@4, cr_call_center_sk@2)], projection=[cc_call_center_id@1, cc_name@2, cc_manager@3, cr_returned_date_sk@5, cr_returning_customer_sk@6, cr_net_loss@8] │ CoalescePartitionsExec │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], ...]}, projection=[cr_returned_date_sk, cr_returning_customer_sk, cr_call_center_sk, cr_net_loss], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_returns/part-3.parquet:..]]}, projection=[cr_returned_date_sk, cr_returning_customer_sk, cr_call_center_sk, cr_net_loss], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/call_center/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/call_center/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/call_center/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/call_center/part-3.parquet]]}, projection=[cc_call_center_sk, cc_call_center_id, cc_name, cc_manager, CAST(cc_call_center_sk@0 AS Float64) as CAST(call_center.cc_call_center_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/call_center/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/call_center/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/call_center/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/call_center/part-3.parquet]]}, projection=[cc_call_center_sk, cc_call_center_id, cc_name, cc_manager, CAST(cc_call_center_sk@0 AS Float64) as CAST(call_center.cc_call_center_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── "); Ok(()) @@ -10371,7 +9568,7 @@ mod tests { #[tokio::test] async fn test_tpcds_92() -> Result<()> { let display = test_tpcds_query("q92").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ ProjectionExec: expr=[sum(web_sales.ws_ext_discount_amt)@0 as Excess Discount Amount] │ GlobalLimitExec: skip=0, fetch=100 @@ -10383,13 +9580,13 @@ mod tests { │ AggregateExec: mode=Partial, gby=[], aggr=[sum(web_sales.ws_ext_discount_amt)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@1, ws_item_sk@1)], filter=CAST(ws_ext_discount_amt@0 AS Decimal128(30, 15)) > Float64(1.3) * avg(web_sales.ws_ext_discount_amt)@1, projection=[ws_ext_discount_amt@0] │ CoalescePartitionsExec - │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 + │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 │ ProjectionExec: expr=[CAST(1.3 * CAST(avg(web_sales.ws_ext_discount_amt)@1 AS Float64) AS Decimal128(30, 15)) as Float64(1.3) * avg(web_sales.ws_ext_discount_amt), ws_item_sk@0 as ws_item_sk] │ AggregateExec: mode=FinalPartitioned, gby=[ws_item_sk@0 as ws_item_sk], aggr=[avg(web_sales.ws_ext_discount_amt)] │ [Stage 5] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── - ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ws_sold_date_sk@0)], projection=[ws_ext_discount_amt@3, i_item_sk@4] │ CoalescePartitionsExec │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 @@ -10397,40 +9594,35 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, ws_item_sk@1)], projection=[i_item_sk@0, ws_sold_date_sk@1, ws_ext_discount_amt@3] │ CoalescePartitionsExec │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_item_sk, ws_ext_discount_amt], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_item_sk, ws_ext_discount_amt], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_date@1 >= 2000-01-27 AND d_date@1 <= 2000-04-26, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-01-27 AND d_date@2 <= 2000-04-26, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-01-27 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-04-26, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-01-27 AND d_date@2 <= 2000-04-26, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-01-27 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-04-26, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ FilterExec: i_manufact_id@1 = 350, projection=[i_item_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_manufact_id], file_type=parquet, predicate=i_manufact_id@13 = 350, pruning_predicate=i_manufact_id_null_count@2 != row_count@3 AND i_manufact_id_min@0 <= 350 AND 350 <= i_manufact_id_max@1, required_guarantees=[i_manufact_id in (350)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_manufact_id], file_type=parquet, predicate=i_manufact_id@13 = 350, pruning_predicate=i_manufact_id_null_count@2 != row_count@3 AND i_manufact_id_min@0 <= 350 AND 350 <= i_manufact_id_max@1, required_guarantees=[i_manufact_id in (350)] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([ws_item_sk@0], 9), input_partitions=2 + │ RepartitionExec: partitioning=Hash([ws_item_sk@0], 9), input_partitions=3 │ AggregateExec: mode=Partial, gby=[ws_item_sk@0 as ws_item_sk], aggr=[avg(web_sales.ws_ext_discount_amt)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ws_sold_date_sk@0)], projection=[ws_item_sk@3, ws_ext_discount_amt@4] │ CoalescePartitionsExec │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_sold_date_sk, ws_item_sk, ws_ext_discount_amt], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_sold_date_sk, ws_item_sk, ws_ext_discount_amt], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_date@1 >= 2000-01-27 AND d_date@1 <= 2000-04-26, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-01-27 AND d_date@2 <= 2000-04-26, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-01-27 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-04-26, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 2000-01-27 AND d_date@2 <= 2000-04-26, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 2000-01-27 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 2000-04-26, required_guarantees=[] └────────────────────────────────────────────────── "); Ok(()) @@ -10438,7 +9630,7 @@ mod tests { #[tokio::test] async fn test_tpcds_93() -> Result<()> { let display = test_tpcds_query("q93").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [sumsales@1 ASC, ss_customer_sk@0 ASC], fetch=100 │ [Stage 5] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -10466,18 +9658,15 @@ mod tests { │ ProjectionExec: expr=[r_reason_sk@0 as r_reason_sk, CAST(r_reason_sk@0 AS Float64) as CAST(reason.r_reason_sk AS Float64)] │ FilterExec: r_reason_desc@1 = reason 28, projection=[r_reason_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/reason/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/reason/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/reason/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/reason/part-3.parquet]]}, projection=[r_reason_sk, r_reason_desc], file_type=parquet, predicate=r_reason_desc@2 = reason 28, pruning_predicate=r_reason_desc_null_count@2 != row_count@3 AND r_reason_desc_min@0 <= reason 28 AND reason 28 <= r_reason_desc_max@1, required_guarantees=[r_reason_desc in (reason 28)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/reason/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/reason/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/reason/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/reason/part-3.parquet]]}, projection=[r_reason_sk, r_reason_desc], file_type=parquet, predicate=r_reason_desc@2 = reason 28, pruning_predicate=r_reason_desc_null_count@2 != row_count@3 AND r_reason_desc_min@0 <= reason 28 AND reason 28 <= r_reason_desc_max@1, required_guarantees=[r_reason_desc in (reason 28)] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([sr_item_sk@0, sr_ticket_number@2], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], ...]}, projection=[sr_item_sk, sr_reason_sk, sr_ticket_number, sr_return_quantity], file_type=parquet + │ RepartitionExec: partitioning=Hash([sr_item_sk@0, sr_ticket_number@2], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_returns/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_returns/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_returns/part-3.parquet:..]]}, projection=[sr_item_sk, sr_reason_sk, sr_ticket_number, sr_return_quantity], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p0..p8] t2:[p0..p8] - │ RepartitionExec: partitioning=Hash([ss_item_sk@0, ss_ticket_number@2], 9), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_item_sk, ss_customer_sk, ss_ticket_number, ss_quantity, ss_sales_price], file_type=parquet + │ RepartitionExec: partitioning=Hash([ss_item_sk@0, ss_ticket_number@2], 9), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_item_sk, ss_customer_sk, ss_ticket_number, ss_quantity, ss_sales_price], file_type=parquet └────────────────────────────────────────────────── "); Ok(()) @@ -10485,7 +9674,7 @@ mod tests { #[tokio::test] async fn test_tpcds_94() -> Result<()> { let display = test_tpcds_query("q94").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ ProjectionExec: expr=[count(alias1)@0 as order count, sum(alias2)@1 as total shipping cost, sum(alias3)@2 as total net profit] │ GlobalLimitExec: skip=0, fetch=100 @@ -10499,12 +9688,12 @@ mod tests { │ CoalescePartitionsExec │ HashJoinExec: mode=CollectLeft, join_type=LeftSemi, on=[(ws_order_number@1, ws_order_number@1)], filter=ws_warehouse_sk@0 != ws_warehouse_sk@1, projection=[ws_order_number@1, ws_ext_ship_cost@2, ws_net_profit@3] │ CoalescePartitionsExec - │ [Stage 4] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 4] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_warehouse_sk, ws_order_number], file_type=parquet │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 │ DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_returns/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/web_returns/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_returns/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/web_returns/part-3.parquet]]}, projection=[wr_order_number], file_type=parquet └────────────────────────────────────────────────── - ┌───── Stage 4 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 4 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(web_site.web_site_sk AS Float64)@1, ws_web_site_sk@0)], projection=[ws_warehouse_sk@3, ws_order_number@4, ws_ext_ship_cost@5, ws_net_profit@6] │ CoalescePartitionsExec │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 @@ -10514,32 +9703,28 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ws_ship_date_sk@0)], projection=[ws_ship_addr_sk@3, ws_web_site_sk@4, ws_warehouse_sk@5, ws_order_number@6, ws_ext_ship_cost@7, ws_net_profit@8] │ CoalescePartitionsExec │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_ship_date_sk, ws_ship_addr_sk, ws_web_site_sk, ws_warehouse_sk, ws_order_number, ws_ext_ship_cost, ws_net_profit], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_ship_date_sk, ws_ship_addr_sk, ws_web_site_sk, ws_warehouse_sk, ws_order_number, ws_ext_ship_cost, ws_net_profit], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[web_site_sk@0 as web_site_sk, CAST(web_site_sk@0 AS Float64) as CAST(web_site.web_site_sk AS Float64)] │ FilterExec: web_company_name@1 = pri, projection=[web_site_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_site/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_site/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_site/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_site/part-3.parquet]]}, projection=[web_site_sk, web_company_name], file_type=parquet, predicate=web_company_name@14 = pri, pruning_predicate=web_company_name_null_count@2 != row_count@3 AND web_company_name_min@0 <= pri AND pri <= web_company_name_max@1, required_guarantees=[web_company_name in (pri)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_site/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/web_site/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_site/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/web_site/part-3.parquet]]}, projection=[web_site_sk, web_company_name], file_type=parquet, predicate=web_company_name@14 = pri, pruning_predicate=web_company_name_null_count@2 != row_count@3 AND web_company_name_min@0 <= pri AND pri <= web_company_name_max@1, required_guarantees=[web_company_name in (pri)] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[ca_address_sk@0 as ca_address_sk, CAST(ca_address_sk@0 AS Float64) as CAST(customer_address.ca_address_sk AS Float64)] │ FilterExec: ca_state@1 = IL, projection=[ca_address_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_state], file_type=parquet, predicate=ca_state@8 = IL, pruning_predicate=ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= IL AND IL <= ca_state_max@1, required_guarantees=[ca_state in (IL)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_state], file_type=parquet, predicate=ca_state@8 = IL, pruning_predicate=ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= IL AND IL <= ca_state_max@1, required_guarantees=[ca_state in (IL)] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_date@1 >= 1999-02-01 AND d_date@1 <= 1999-04-02, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 1999-02-01 AND d_date@2 <= 1999-04-02, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 1999-02-01 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 1999-04-02, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 1999-02-01 AND d_date@2 <= 1999-04-02, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 1999-02-01 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 1999-04-02, required_guarantees=[] └────────────────────────────────────────────────── "); Ok(()) @@ -10561,7 +9746,7 @@ mod tests { │ CoalescePartitionsExec │ HashJoinExec: mode=CollectLeft, join_type=LeftSemi, on=[(ws_order_number@0, ws_order_number@0)] │ CoalescePartitionsExec - │ [Stage 4] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 4] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 │ HashJoinExec: mode=Partitioned, join_type=Inner, on=[(ws_order_number@1, ws_order_number@1)], filter=ws_warehouse_sk@1 != ws_warehouse_sk@0, projection=[ws_order_number@1] │ [Stage 5] => NetworkShuffleExec: output_partitions=3, input_tasks=3 │ [Stage 6] => NetworkShuffleExec: output_partitions=3, input_tasks=3 @@ -10572,7 +9757,7 @@ mod tests { │ [Stage 8] => NetworkShuffleExec: output_partitions=3, input_tasks=3 │ [Stage 9] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── - ┌───── Stage 4 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 4 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(web_site.web_site_sk AS Float64)@1, ws_web_site_sk@0)], projection=[ws_order_number@3, ws_ext_ship_cost@4, ws_net_profit@5] │ CoalescePartitionsExec │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 @@ -10582,57 +9767,48 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ws_ship_date_sk@0)], projection=[ws_ship_addr_sk@3, ws_web_site_sk@4, ws_order_number@5, ws_ext_ship_cost@6, ws_net_profit@7] │ CoalescePartitionsExec │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_ship_date_sk, ws_ship_addr_sk, ws_web_site_sk, ws_order_number, ws_ext_ship_cost, ws_net_profit], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_ship_date_sk, ws_ship_addr_sk, ws_web_site_sk, ws_order_number, ws_ext_ship_cost, ws_net_profit], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[web_site_sk@0 as web_site_sk, CAST(web_site_sk@0 AS Float64) as CAST(web_site.web_site_sk AS Float64)] │ FilterExec: web_company_name@1 = pri, projection=[web_site_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_site/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_site/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_site/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_site/part-3.parquet]]}, projection=[web_site_sk, web_company_name], file_type=parquet, predicate=web_company_name@14 = pri, pruning_predicate=web_company_name_null_count@2 != row_count@3 AND web_company_name_min@0 <= pri AND pri <= web_company_name_max@1, required_guarantees=[web_company_name in (pri)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_site/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/web_site/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_site/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/web_site/part-3.parquet]]}, projection=[web_site_sk, web_company_name], file_type=parquet, predicate=web_company_name@14 = pri, pruning_predicate=web_company_name_null_count@2 != row_count@3 AND web_company_name_min@0 <= pri AND pri <= web_company_name_max@1, required_guarantees=[web_company_name in (pri)] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[ca_address_sk@0 as ca_address_sk, CAST(ca_address_sk@0 AS Float64) as CAST(customer_address.ca_address_sk AS Float64)] │ FilterExec: ca_state@1 = IL, projection=[ca_address_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_state], file_type=parquet, predicate=ca_state@8 = IL, pruning_predicate=ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= IL AND IL <= ca_state_max@1, required_guarantees=[ca_state in (IL)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/customer_address/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/customer_address/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/customer_address/part-3.parquet]]}, projection=[ca_address_sk, ca_state], file_type=parquet, predicate=ca_state@8 = IL, pruning_predicate=ca_state_null_count@2 != row_count@3 AND ca_state_min@0 <= IL AND IL <= ca_state_max@1, required_guarantees=[ca_state in (IL)] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_date@1 >= 1999-02-01 AND d_date@1 <= 1999-04-02, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 1999-02-01 AND d_date@2 <= 1999-04-02, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 1999-02-01 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 1999-04-02, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 1999-02-01 AND d_date@2 <= 1999-04-02, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 1999-02-01 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 1999-04-02, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p2] t1:[p0..p2] t2:[p0..p2] - │ RepartitionExec: partitioning=Hash([ws_order_number@1], 3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_warehouse_sk, ws_order_number], file_type=parquet + │ RepartitionExec: partitioning=Hash([ws_order_number@1], 3), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_warehouse_sk, ws_order_number], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p2] t1:[p0..p2] t2:[p0..p2] - │ RepartitionExec: partitioning=Hash([ws_order_number@1], 3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_warehouse_sk, ws_order_number], file_type=parquet, predicate=DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([ws_order_number@1], 3), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_warehouse_sk, ws_order_number], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 7 ── Tasks: t0:[p0..p1] t1:[p2..p3] │ BroadcastExec: input_partitions=2, consumer_tasks=1, output_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_returns/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_returns/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_returns/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_returns/part-3.parquet]]}, projection=[wr_order_number], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_returns/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/web_returns/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/web_returns/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/web_returns/part-3.parquet]]}, projection=[wr_order_number], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p2] t1:[p0..p2] t2:[p0..p2] - │ RepartitionExec: partitioning=Hash([ws_order_number@1], 3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_warehouse_sk, ws_order_number], file_type=parquet, predicate=DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([ws_order_number@1], 3), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_warehouse_sk, ws_order_number], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p2] t1:[p0..p2] t2:[p0..p2] - │ RepartitionExec: partitioning=Hash([ws_order_number@1], 3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], ...]}, projection=[ws_warehouse_sk, ws_order_number], file_type=parquet, predicate=DynamicFilter [ empty ] + │ RepartitionExec: partitioning=Hash([ws_order_number@1], 3), input_partitions=3 + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/web_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/web_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/web_sales/part-3.parquet:..]]}, projection=[ws_warehouse_sk, ws_order_number], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── "); Ok(()) @@ -10640,15 +9816,15 @@ mod tests { #[tokio::test] async fn test_tpcds_96() -> Result<()> { let display = test_tpcds_query("q96").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ ProjectionExec: expr=[count(Int64(1))@0 as count(*)] │ GlobalLimitExec: skip=0, fetch=100 │ AggregateExec: mode=Final, gby=[], aggr=[count(Int64(1))] │ CoalescePartitionsExec - │ [Stage 4] => NetworkCoalesceExec: output_partitions=6, input_tasks=3 + │ [Stage 4] => NetworkCoalesceExec: output_partitions=9, input_tasks=3 └────────────────────────────────────────────────── - ┌───── Stage 4 ── Tasks: t0:[p0..p1] t1:[p2..p3] t2:[p4..p5] + ┌───── Stage 4 ── Tasks: t0:[p0..p2] t1:[p3..p5] t2:[p6..p8] │ AggregateExec: mode=Partial, gby=[], aggr=[count(Int64(1))] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(store.s_store_sk AS Float64)@1, ss_store_sk@0)], projection=[] │ CoalescePartitionsExec @@ -10659,32 +9835,28 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(household_demographics.hd_demo_sk AS Float64)@1, ss_hdemo_sk@1)], projection=[ss_sold_time_sk@2, ss_store_sk@4] │ CoalescePartitionsExec │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_time_sk, ss_hdemo_sk, ss_store_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_time_sk, ss_hdemo_sk, ss_store_sk], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[s_store_sk@0 as s_store_sk, CAST(s_store_sk@0 AS Float64) as CAST(store.s_store_sk AS Float64)] │ FilterExec: s_store_name@1 = ese, projection=[s_store_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name], file_type=parquet, predicate=s_store_name@5 = ese, pruning_predicate=s_store_name_null_count@2 != row_count@3 AND s_store_name_min@0 <= ese AND ese <= s_store_name_max@1, required_guarantees=[s_store_name in (ese)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/store/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/store/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/store/part-3.parquet]]}, projection=[s_store_sk, s_store_name], file_type=parquet, predicate=s_store_name@5 = ese, pruning_predicate=s_store_name_null_count@2 != row_count@3 AND s_store_name_min@0 <= ese AND ese <= s_store_name_max@1, required_guarantees=[s_store_name in (ese)] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[t_time_sk@0 as t_time_sk, CAST(t_time_sk@0 AS Float64) as CAST(time_dim.t_time_sk AS Float64)] │ FilterExec: t_hour@1 = 20 AND t_minute@2 >= 30, projection=[t_time_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/time_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-3.parquet]]}, projection=[t_time_sk, t_hour, t_minute], file_type=parquet, predicate=t_hour@3 = 20 AND t_minute@4 >= 30, pruning_predicate=t_hour_null_count@2 != row_count@3 AND t_hour_min@0 <= 20 AND 20 <= t_hour_max@1 AND t_minute_null_count@5 != row_count@3 AND t_minute_max@4 >= 30, required_guarantees=[t_hour in (20)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/time_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/time_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/time_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/time_dim/part-3.parquet]]}, projection=[t_time_sk, t_hour, t_minute], file_type=parquet, predicate=t_hour@3 = 20 AND t_minute@4 >= 30, pruning_predicate=t_hour_null_count@2 != row_count@3 AND t_hour_min@0 <= 20 AND 20 <= t_hour_max@1 AND t_minute_null_count@5 != row_count@3 AND t_minute_max@4 >= 30, required_guarantees=[t_hour in (20)] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[hd_demo_sk@0 as hd_demo_sk, CAST(hd_demo_sk@0 AS Float64) as CAST(household_demographics.hd_demo_sk AS Float64)] │ FilterExec: hd_dep_count@1 = 7, projection=[hd_demo_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_dep_count], file_type=parquet, predicate=hd_dep_count@3 = 7, pruning_predicate=hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 7 AND 7 <= hd_dep_count_max@1, required_guarantees=[hd_dep_count in (7)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/household_demographics/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/household_demographics/part-3.parquet]]}, projection=[hd_demo_sk, hd_dep_count], file_type=parquet, predicate=hd_dep_count@3 = 7, pruning_predicate=hd_dep_count_null_count@2 != row_count@3 AND hd_dep_count_min@0 <= 7 AND 7 <= hd_dep_count_max@1, required_guarantees=[hd_dep_count in (7)] └────────────────────────────────────────────────── "); Ok(()) @@ -10692,7 +9864,7 @@ mod tests { #[tokio::test] async fn test_tpcds_97() -> Result<()> { let display = test_tpcds_query("q97").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ ProjectionExec: expr=[sum(CASE WHEN ssci.customer_sk IS NOT NULL AND csci.customer_sk IS NULL THEN Int64(1) ELSE Int64(0) END)@0 as store_only, sum(CASE WHEN ssci.customer_sk IS NULL AND csci.customer_sk IS NOT NULL THEN Int64(1) ELSE Int64(0) END)@1 as catalog_only, sum(CASE WHEN ssci.customer_sk IS NOT NULL AND csci.customer_sk IS NOT NULL THEN Int64(1) ELSE Int64(0) END)@2 as store_and_catalog] │ GlobalLimitExec: skip=0, fetch=100 @@ -10713,38 +9885,34 @@ mod tests { │ [Stage 2] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([cs_bill_customer_sk@0, cs_item_sk@1], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([cs_bill_customer_sk@0, cs_item_sk@1], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[cs_bill_customer_sk@0 as cs_bill_customer_sk, cs_item_sk@1 as cs_item_sk], aggr=[] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, cs_sold_date_sk@0)], projection=[cs_bill_customer_sk@3, cs_item_sk@4] │ CoalescePartitionsExec │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_bill_customer_sk, cs_item_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_bill_customer_sk, cs_item_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_month_seq@1 >= 1200 AND d_month_seq@1 <= 1211, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_month_seq], file_type=parquet, predicate=d_month_seq@3 >= 1200 AND d_month_seq@3 <= 1211, pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1200 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1211, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_month_seq], file_type=parquet, predicate=d_month_seq@3 >= 1200 AND d_month_seq@3 <= 1211, pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1200 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1211, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p2] t1:[p0..p2] t2:[p0..p2] - │ RepartitionExec: partitioning=Hash([ss_customer_sk@0, ss_item_sk@1], 3), input_partitions=2 + │ RepartitionExec: partitioning=Hash([ss_customer_sk@0, ss_item_sk@1], 3), input_partitions=3 │ AggregateExec: mode=Partial, gby=[ss_customer_sk@1 as ss_customer_sk, ss_item_sk@0 as ss_item_sk], aggr=[] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ss_sold_date_sk@0)], projection=[ss_item_sk@3, ss_customer_sk@4] │ CoalescePartitionsExec │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_customer_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_customer_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_month_seq@1 >= 1200 AND d_month_seq@1 <= 1211, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_month_seq], file_type=parquet, predicate=d_month_seq@3 >= 1200 AND d_month_seq@3 <= 1211, pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1200 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1211, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_month_seq], file_type=parquet, predicate=d_month_seq@3 >= 1200 AND d_month_seq@3 <= 1211, pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1200 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1211, required_guarantees=[] └────────────────────────────────────────────────── "); Ok(()) @@ -10767,7 +9935,7 @@ mod tests { │ [Stage 3] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([i_item_id@0, i_item_desc@1, i_category@2, i_class@3, i_current_price@4], 6), input_partitions=2 + │ RepartitionExec: partitioning=Hash([i_item_id@0, i_item_desc@1, i_category@2, i_class@3, i_current_price@4], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[i_item_id@1 as i_item_id, i_item_desc@2 as i_item_desc, i_category@5 as i_category, i_class@4 as i_class, i_current_price@3 as i_current_price], aggr=[sum(store_sales.ss_ext_sales_price)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(date_dim.d_date_sk AS Float64)@1, ss_sold_date_sk@0)], projection=[ss_ext_sales_price@3, i_item_id@4, i_item_desc@5, i_current_price@6, i_class@7, i_category@8] │ CoalescePartitionsExec @@ -10776,23 +9944,20 @@ mod tests { │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(i_item_sk@0, ss_item_sk@1)], projection=[i_item_id@1, i_item_desc@2, i_current_price@3, i_class@4, i_category@5, ss_sold_date_sk@6, ss_ext_sales_price@8] │ CoalescePartitionsExec │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], ...]}, projection=[ss_sold_date_sk, ss_item_sk, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/store_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/store_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/store_sales/part-3.parquet:..]]}, projection=[ss_sold_date_sk, ss_item_sk, ss_ext_sales_price], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_date@1 >= 1999-02-22 AND d_date@1 <= 1999-03-24, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 1999-02-22 AND d_date@2 <= 1999-03-24, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 1999-02-22 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 1999-03-24, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_date], file_type=parquet, predicate=d_date@2 >= 1999-02-22 AND d_date@2 <= 1999-03-24, pruning_predicate=d_date_null_count@1 != row_count@2 AND d_date_max@0 >= 1999-02-22 AND d_date_null_count@1 != row_count@2 AND d_date_min@3 <= 1999-03-24, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ FilterExec: i_category@5 = Sports OR i_category@5 = Books OR i_category@5 = Home │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id, i_item_desc, i_current_price, i_class, i_category], file_type=parquet, predicate=i_category@12 = Sports OR i_category@12 = Books OR i_category@12 = Home, pruning_predicate=i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Sports AND Sports <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Books AND Books <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Home AND Home <= i_category_max@1, required_guarantees=[i_category in (Books, Home, Sports)] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/item/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/item/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/item/part-3.parquet]]}, projection=[i_item_sk, i_item_id, i_item_desc, i_current_price, i_class, i_category], file_type=parquet, predicate=i_category@12 = Sports OR i_category@12 = Books OR i_category@12 = Home, pruning_predicate=i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Sports AND Sports <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Books AND Books <= i_category_max@1 OR i_category_null_count@2 != row_count@3 AND i_category_min@0 <= Home AND Home <= i_category_max@1, required_guarantees=[i_category in (Books, Home, Sports)] └────────────────────────────────────────────────── "#); Ok(()) @@ -10800,7 +9965,7 @@ mod tests { #[tokio::test] async fn test_tpcds_99() -> Result<()> { let display = test_tpcds_query("q99").await?; - assert_snapshot!(display, @" + assert_snapshot!(display, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [w_substr@0 ASC, sm_type@1 ASC, cc_name_lower@2 ASC], fetch=100 │ [Stage 6] => NetworkCoalesceExec: output_partitions=6, input_tasks=2 @@ -10821,38 +9986,33 @@ mod tests { │ ProjectionExec: expr=[d_date_sk@0 as d_date_sk, CAST(d_date_sk@0 AS Float64) as CAST(date_dim.d_date_sk AS Float64)] │ FilterExec: d_month_seq@1 >= 1200 AND d_month_seq@1 <= 1211, projection=[d_date_sk@0] │ RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=2 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_month_seq], file_type=parquet, predicate=d_month_seq@3 >= 1200 AND d_month_seq@3 <= 1211 AND DynamicFilter [ empty ], pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1200 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1211, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/date_dim/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/date_dim/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/date_dim/part-3.parquet]]}, projection=[d_date_sk, d_month_seq], file_type=parquet, predicate=d_month_seq@3 >= 1200 AND d_month_seq@3 <= 1211 AND DynamicFilter [ empty ], pruning_predicate=d_month_seq_null_count@1 != row_count@2 AND d_month_seq_max@0 >= 1200 AND d_month_seq_null_count@1 != row_count@2 AND d_month_seq_min@3 <= 1211, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(cs_call_center_sk@2, CAST(call_center.cc_call_center_sk AS Float64)@2)], projection=[cs_sold_date_sk@0, cs_ship_date_sk@1, w_substr@3, sm_type@4, cc_name@6] │ CoalescePartitionsExec │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/call_center/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/call_center/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/call_center/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/call_center/part-3.parquet]]}, projection=[cc_call_center_sk, cc_name, CAST(cc_call_center_sk@0 AS Float64) as CAST(call_center.cc_call_center_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/call_center/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/call_center/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/call_center/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/call_center/part-3.parquet]]}, projection=[cc_call_center_sk, cc_name, CAST(cc_call_center_sk@0 AS Float64) as CAST(call_center.cc_call_center_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(cs_ship_mode_sk@3, CAST(ship_mode.sm_ship_mode_sk AS Float64)@2)], projection=[cs_sold_date_sk@0, cs_ship_date_sk@1, cs_call_center_sk@2, w_substr@4, sm_type@6] │ CoalescePartitionsExec - │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=3 - │ PartitionIsolatorExec: tasks=3 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/ship_mode/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/ship_mode/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/ship_mode/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/ship_mode/part-3.parquet]]}, projection=[sm_ship_mode_sk, sm_type, CAST(sm_ship_mode_sk@0 AS Float64) as CAST(ship_mode.sm_ship_mode_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] + │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=3, stage_partitions=9, input_tasks=3 + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/ship_mode/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/ship_mode/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/ship_mode/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/ship_mode/part-3.parquet]]}, projection=[sm_ship_mode_sk, sm_type, CAST(sm_ship_mode_sk@0 AS Float64) as CAST(ship_mode.sm_ship_mode_sk AS Float64)], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] - │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 + ┌───── Stage 2 ── Tasks: t0:[p0..p8] t1:[p9..p17] t2:[p18..p26] + │ BroadcastExec: input_partitions=3, consumer_tasks=3, output_partitions=9 │ ProjectionExec: expr=[cs_sold_date_sk@1 as cs_sold_date_sk, cs_ship_date_sk@2 as cs_ship_date_sk, cs_call_center_sk@3 as cs_call_center_sk, cs_ship_mode_sk@4 as cs_ship_mode_sk, w_substr@0 as w_substr] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(CAST(sq1.w_warehouse_sk AS Float64)@2, cs_warehouse_sk@4)], projection=[w_substr@0, cs_sold_date_sk@3, cs_ship_date_sk@4, cs_call_center_sk@5, cs_ship_mode_sk@6] │ CoalescePartitionsExec │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=2, stage_partitions=6, input_tasks=2 - │ PartitionIsolatorExec: tasks=3 partitions=6 - │ DataSourceExec: file_groups={6 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], ...]}, projection=[cs_sold_date_sk, cs_ship_date_sk, cs_call_center_sk, cs_ship_mode_sk, cs_warehouse_sk], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-0.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-1.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:..], [/testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-2.parquet:.., /testdata/tpcds/plans_sf1_partitions4/catalog_sales/part-3.parquet:..]]}, projection=[cs_sold_date_sk, cs_ship_date_sk, cs_call_center_sk, cs_ship_mode_sk, cs_warehouse_sk], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p6..p11] │ BroadcastExec: input_partitions=2, consumer_tasks=3, output_partitions=6 - │ PartitionIsolatorExec: tasks=2 partitions=4 - │ DataSourceExec: file_groups={4 groups: [[/testdata/tpcds/plans_sf1_partitions4/warehouse/part-0.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-2.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-3.parquet]]}, projection=[substr(w_warehouse_name@2, 1, 20) as w_substr, w_warehouse_sk, CAST(w_warehouse_sk@0 AS Float64) as CAST(sq1.w_warehouse_sk AS Float64)], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={2 groups: [[/testdata/tpcds/plans_sf1_partitions4/warehouse/part-0.parquet, /testdata/tpcds/plans_sf1_partitions4/warehouse/part-1.parquet], [/testdata/tpcds/plans_sf1_partitions4/warehouse/part-2.parquet, /testdata/tpcds/plans_sf1_partitions4/warehouse/part-3.parquet]]}, projection=[substr(w_warehouse_name@2, 1, 20) as w_substr, w_warehouse_sk, CAST(w_warehouse_sk@0 AS Float64) as CAST(sq1.w_warehouse_sk AS Float64)], file_type=parquet └────────────────────────────────────────────────── "); Ok(()) diff --git a/tests/tpch_plans_test.rs b/tests/tpch_plans_test.rs index 5fafd7e7..2096f0f0 100644 --- a/tests/tpch_plans_test.rs +++ b/tests/tpch_plans_test.rs @@ -17,7 +17,7 @@ mod tests { #[tokio::test] async fn test_tpch_1() -> Result<(), Box> { let plan = test_tpch_query("q1").await?; - assert_snapshot!(plan, @" + assert_snapshot!(plan, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [l_returnflag@0 ASC NULLS LAST, l_linestatus@1 ASC NULLS LAST] │ [Stage 2] => NetworkCoalesceExec: output_partitions=12, input_tasks=2 @@ -29,12 +29,11 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=6, input_tasks=4 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p11] t1:[p0..p11] t2:[p0..p11] t3:[p0..p11] - │ RepartitionExec: partitioning=Hash([l_returnflag@0, l_linestatus@1], 12), input_partitions=4 + │ RepartitionExec: partitioning=Hash([l_returnflag@0, l_linestatus@1], 12), input_partitions=6 │ AggregateExec: mode=Partial, gby=[l_returnflag@5 as l_returnflag, l_linestatus@6 as l_linestatus], aggr=[sum(lineitem.l_quantity), sum(lineitem.l_extendedprice), sum(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount), sum(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount * Int64(1) + lineitem.l_tax), avg(lineitem.l_quantity), avg(lineitem.l_extendedprice), avg(lineitem.l_discount), count(Int64(1))] │ ProjectionExec: expr=[l_extendedprice@1 * (Some(1),20,0 - l_discount@2) as __common_expr_1, l_quantity@0 as l_quantity, l_extendedprice@1 as l_extendedprice, l_discount@2 as l_discount, l_tax@3 as l_tax, l_returnflag@4 as l_returnflag, l_linestatus@5 as l_linestatus] │ FilterExec: l_shipdate@6 <= 1998-09-02, projection=[l_quantity@0, l_extendedprice@1, l_discount@2, l_tax@3, l_returnflag@4, l_linestatus@5] - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet], [/testdata/tpch/plan_sf0.02/lineitem/10.parquet], [/testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet], [/testdata/tpch/plan_sf0.02/lineitem/13.parquet], ...]}, projection=[l_quantity, l_extendedprice, l_discount, l_tax, l_returnflag, l_linestatus, l_shipdate], file_type=parquet, predicate=l_shipdate@10 <= 1998-09-02, pruning_predicate=l_shipdate_null_count@1 != row_count@2 AND l_shipdate_min@0 <= 1998-09-02, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet, /testdata/tpch/plan_sf0.02/lineitem/10.parquet, /testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet, /testdata/tpch/plan_sf0.02/lineitem/13.parquet, /testdata/tpch/plan_sf0.02/lineitem/14.parquet], [/testdata/tpch/plan_sf0.02/lineitem/15.parquet, /testdata/tpch/plan_sf0.02/lineitem/16.parquet, /testdata/tpch/plan_sf0.02/lineitem/2.parquet], [/testdata/tpch/plan_sf0.02/lineitem/3.parquet, /testdata/tpch/plan_sf0.02/lineitem/4.parquet, /testdata/tpch/plan_sf0.02/lineitem/5.parquet], [/testdata/tpch/plan_sf0.02/lineitem/6.parquet, /testdata/tpch/plan_sf0.02/lineitem/7.parquet, /testdata/tpch/plan_sf0.02/lineitem/8.parquet], ...]}, projection=[l_quantity, l_extendedprice, l_discount, l_tax, l_returnflag, l_linestatus, l_shipdate], file_type=parquet, predicate=l_shipdate@10 <= 1998-09-02, pruning_predicate=l_shipdate_null_count@1 != row_count@2 AND l_shipdate_min@0 <= 1998-09-02, required_guarantees=[] └────────────────────────────────────────────────── "); Ok(()) @@ -43,7 +42,7 @@ mod tests { #[tokio::test] async fn test_tpch_2() -> Result<(), Box> { let plan = test_tpch_query("q2").await?; - assert_snapshot!(plan, @" + assert_snapshot!(plan, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [s_acctbal@0 DESC, n_name@2 ASC NULLS LAST, s_name@1 ASC NULLS LAST, p_partkey@3 ASC NULLS LAST] │ [Stage 11] => NetworkCoalesceExec: output_partitions=24, input_tasks=4 @@ -56,45 +55,40 @@ mod tests { │ [Stage 10] => NetworkShuffleExec: output_partitions=6, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p23] t1:[p0..p23] t2:[p0..p23] t3:[p0..p23] - │ RepartitionExec: partitioning=Hash([p_partkey@0, ps_supplycost@7], 24), input_partitions=4 + │ RepartitionExec: partitioning=Hash([p_partkey@0, ps_supplycost@7], 24), input_partitions=6 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(r_regionkey@0, n_regionkey@9)], projection=[p_partkey@1, p_mfgr@2, s_name@3, s_address@4, s_phone@5, s_acctbal@6, s_comment@7, ps_supplycost@8, n_name@9] │ CoalescePartitionsExec - │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ ProjectionExec: expr=[p_partkey@2 as p_partkey, p_mfgr@3 as p_mfgr, s_name@4 as s_name, s_address@5 as s_address, s_phone@6 as s_phone, s_acctbal@7 as s_acctbal, s_comment@8 as s_comment, ps_supplycost@9 as ps_supplycost, n_name@0 as n_name, n_regionkey@1 as n_regionkey] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(n_nationkey@0, s_nationkey@4)], projection=[n_name@1, n_regionkey@2, p_partkey@3, p_mfgr@4, s_name@5, s_address@6, s_phone@8, s_acctbal@9, s_comment@10, ps_supplycost@11] │ CoalescePartitionsExec - │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ ProjectionExec: expr=[p_partkey@6 as p_partkey, p_mfgr@7 as p_mfgr, s_name@0 as s_name, s_address@1 as s_address, s_nationkey@2 as s_nationkey, s_phone@3 as s_phone, s_acctbal@4 as s_acctbal, s_comment@5 as s_comment, ps_supplycost@8 as ps_supplycost] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(s_suppkey@0, ps_suppkey@2)], projection=[s_name@1, s_address@2, s_nationkey@3, s_phone@4, s_acctbal@5, s_comment@6, p_partkey@7, p_mfgr@8, ps_supplycost@10] │ CoalescePartitionsExec - │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(p_partkey@0, ps_partkey@0)], projection=[p_partkey@0, p_mfgr@1, ps_suppkey@3, ps_supplycost@4] │ CoalescePartitionsExec - │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/partsupp/1.parquet], [/testdata/tpch/plan_sf0.02/partsupp/10.parquet], [/testdata/tpch/plan_sf0.02/partsupp/11.parquet], [/testdata/tpch/plan_sf0.02/partsupp/12.parquet], [/testdata/tpch/plan_sf0.02/partsupp/13.parquet], ...]}, projection=[ps_partkey, ps_suppkey, ps_supplycost], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/partsupp/1.parquet, /testdata/tpch/plan_sf0.02/partsupp/10.parquet, /testdata/tpch/plan_sf0.02/partsupp/11.parquet], [/testdata/tpch/plan_sf0.02/partsupp/12.parquet, /testdata/tpch/plan_sf0.02/partsupp/13.parquet, /testdata/tpch/plan_sf0.02/partsupp/14.parquet], [/testdata/tpch/plan_sf0.02/partsupp/15.parquet, /testdata/tpch/plan_sf0.02/partsupp/16.parquet, /testdata/tpch/plan_sf0.02/partsupp/2.parquet], [/testdata/tpch/plan_sf0.02/partsupp/3.parquet, /testdata/tpch/plan_sf0.02/partsupp/4.parquet, /testdata/tpch/plan_sf0.02/partsupp/5.parquet], [/testdata/tpch/plan_sf0.02/partsupp/6.parquet, /testdata/tpch/plan_sf0.02/partsupp/7.parquet, /testdata/tpch/plan_sf0.02/partsupp/8.parquet], ...]}, projection=[ps_partkey, ps_suppkey, ps_supplycost], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 + ┌───── Stage 1 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 │ FilterExec: r_name@1 = EUROPE, projection=[r_regionkey@0] - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/region/1.parquet], [/testdata/tpch/plan_sf0.02/region/10.parquet], [/testdata/tpch/plan_sf0.02/region/11.parquet], [/testdata/tpch/plan_sf0.02/region/12.parquet], [/testdata/tpch/plan_sf0.02/region/13.parquet], ...]}, projection=[r_regionkey, r_name], file_type=parquet, predicate=r_name@1 = EUROPE, pruning_predicate=r_name_null_count@2 != row_count@3 AND r_name_min@0 <= EUROPE AND EUROPE <= r_name_max@1, required_guarantees=[r_name in (EUROPE)] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/region/1.parquet, /testdata/tpch/plan_sf0.02/region/10.parquet, /testdata/tpch/plan_sf0.02/region/11.parquet], [/testdata/tpch/plan_sf0.02/region/12.parquet, /testdata/tpch/plan_sf0.02/region/13.parquet, /testdata/tpch/plan_sf0.02/region/14.parquet], [/testdata/tpch/plan_sf0.02/region/15.parquet, /testdata/tpch/plan_sf0.02/region/16.parquet, /testdata/tpch/plan_sf0.02/region/2.parquet], [/testdata/tpch/plan_sf0.02/region/3.parquet, /testdata/tpch/plan_sf0.02/region/4.parquet, /testdata/tpch/plan_sf0.02/region/5.parquet], [/testdata/tpch/plan_sf0.02/region/6.parquet, /testdata/tpch/plan_sf0.02/region/7.parquet, /testdata/tpch/plan_sf0.02/region/8.parquet], ...]}, projection=[r_regionkey, r_name], file_type=parquet, predicate=r_name@1 = EUROPE, pruning_predicate=r_name_null_count@2 != row_count@3 AND r_name_min@0 <= EUROPE AND EUROPE <= r_name_max@1, required_guarantees=[r_name in (EUROPE)] └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/nation/1.parquet], [/testdata/tpch/plan_sf0.02/nation/10.parquet], [/testdata/tpch/plan_sf0.02/nation/11.parquet], [/testdata/tpch/plan_sf0.02/nation/12.parquet], [/testdata/tpch/plan_sf0.02/nation/13.parquet], ...]}, projection=[n_nationkey, n_name, n_regionkey], file_type=parquet, predicate=DynamicFilter [ empty ] + ┌───── Stage 2 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/nation/1.parquet, /testdata/tpch/plan_sf0.02/nation/10.parquet, /testdata/tpch/plan_sf0.02/nation/11.parquet], [/testdata/tpch/plan_sf0.02/nation/12.parquet, /testdata/tpch/plan_sf0.02/nation/13.parquet, /testdata/tpch/plan_sf0.02/nation/14.parquet], [/testdata/tpch/plan_sf0.02/nation/15.parquet, /testdata/tpch/plan_sf0.02/nation/16.parquet, /testdata/tpch/plan_sf0.02/nation/2.parquet], [/testdata/tpch/plan_sf0.02/nation/3.parquet, /testdata/tpch/plan_sf0.02/nation/4.parquet, /testdata/tpch/plan_sf0.02/nation/5.parquet], [/testdata/tpch/plan_sf0.02/nation/6.parquet, /testdata/tpch/plan_sf0.02/nation/7.parquet, /testdata/tpch/plan_sf0.02/nation/8.parquet], ...]}, projection=[n_nationkey, n_name, n_regionkey], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 3 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/supplier/1.parquet], [/testdata/tpch/plan_sf0.02/supplier/10.parquet], [/testdata/tpch/plan_sf0.02/supplier/11.parquet], [/testdata/tpch/plan_sf0.02/supplier/12.parquet], [/testdata/tpch/plan_sf0.02/supplier/13.parquet], ...]}, projection=[s_suppkey, s_name, s_address, s_nationkey, s_phone, s_acctbal, s_comment], file_type=parquet, predicate=DynamicFilter [ empty ] + ┌───── Stage 3 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/supplier/1.parquet, /testdata/tpch/plan_sf0.02/supplier/10.parquet, /testdata/tpch/plan_sf0.02/supplier/11.parquet], [/testdata/tpch/plan_sf0.02/supplier/12.parquet, /testdata/tpch/plan_sf0.02/supplier/13.parquet, /testdata/tpch/plan_sf0.02/supplier/14.parquet], [/testdata/tpch/plan_sf0.02/supplier/15.parquet, /testdata/tpch/plan_sf0.02/supplier/16.parquet, /testdata/tpch/plan_sf0.02/supplier/2.parquet], [/testdata/tpch/plan_sf0.02/supplier/3.parquet, /testdata/tpch/plan_sf0.02/supplier/4.parquet, /testdata/tpch/plan_sf0.02/supplier/5.parquet], [/testdata/tpch/plan_sf0.02/supplier/6.parquet, /testdata/tpch/plan_sf0.02/supplier/7.parquet, /testdata/tpch/plan_sf0.02/supplier/8.parquet], ...]}, projection=[s_suppkey, s_name, s_address, s_nationkey, s_phone, s_acctbal, s_comment], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 4 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 + ┌───── Stage 4 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 │ FilterExec: p_size@3 = 15 AND p_type@2 LIKE %BRASS, projection=[p_partkey@0, p_mfgr@1] - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/part/1.parquet], [/testdata/tpch/plan_sf0.02/part/10.parquet], [/testdata/tpch/plan_sf0.02/part/11.parquet], [/testdata/tpch/plan_sf0.02/part/12.parquet], [/testdata/tpch/plan_sf0.02/part/13.parquet], ...]}, projection=[p_partkey, p_mfgr, p_type, p_size], file_type=parquet, predicate=p_size@5 = 15 AND p_type@4 LIKE %BRASS, pruning_predicate=p_size_null_count@2 != row_count@3 AND p_size_min@0 <= 15 AND 15 <= p_size_max@1, required_guarantees=[p_size in (15)] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/part/1.parquet, /testdata/tpch/plan_sf0.02/part/10.parquet, /testdata/tpch/plan_sf0.02/part/11.parquet], [/testdata/tpch/plan_sf0.02/part/12.parquet, /testdata/tpch/plan_sf0.02/part/13.parquet, /testdata/tpch/plan_sf0.02/part/14.parquet], [/testdata/tpch/plan_sf0.02/part/15.parquet, /testdata/tpch/plan_sf0.02/part/16.parquet, /testdata/tpch/plan_sf0.02/part/2.parquet], [/testdata/tpch/plan_sf0.02/part/3.parquet, /testdata/tpch/plan_sf0.02/part/4.parquet, /testdata/tpch/plan_sf0.02/part/5.parquet], [/testdata/tpch/plan_sf0.02/part/6.parquet, /testdata/tpch/plan_sf0.02/part/7.parquet, /testdata/tpch/plan_sf0.02/part/8.parquet], ...]}, projection=[p_partkey, p_mfgr, p_type, p_size], file_type=parquet, predicate=p_size@5 = 15 AND p_type@4 LIKE %BRASS, pruning_predicate=p_size_null_count@2 != row_count@3 AND p_size_min@0 <= 15 AND 15 <= p_size_max@1, required_guarantees=[p_size in (15)] └────────────────────────────────────────────────── ┌───── Stage 10 ── Tasks: t0:[p0..p23] t1:[p0..p23] t2:[p0..p23] │ RepartitionExec: partitioning=Hash([ps_partkey@1, min(partsupp.ps_supplycost)@0], 24), input_partitions=6 @@ -103,37 +97,33 @@ mod tests { │ [Stage 9] => NetworkShuffleExec: output_partitions=6, input_tasks=4 └────────────────────────────────────────────────── ┌───── Stage 9 ── Tasks: t0:[p0..p17] t1:[p0..p17] t2:[p0..p17] t3:[p0..p17] - │ RepartitionExec: partitioning=Hash([ps_partkey@0], 18), input_partitions=4 + │ RepartitionExec: partitioning=Hash([ps_partkey@0], 18), input_partitions=6 │ AggregateExec: mode=Partial, gby=[ps_partkey@0 as ps_partkey], aggr=[min(partsupp.ps_supplycost)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(r_regionkey@0, n_regionkey@2)], projection=[ps_partkey@1, ps_supplycost@2] │ CoalescePartitionsExec - │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ ProjectionExec: expr=[ps_partkey@1 as ps_partkey, ps_supplycost@2 as ps_supplycost, n_regionkey@0 as n_regionkey] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(n_nationkey@0, s_nationkey@2)], projection=[n_regionkey@1, ps_partkey@2, ps_supplycost@3] │ CoalescePartitionsExec - │ [Stage 7] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 7] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ ProjectionExec: expr=[ps_partkey@1 as ps_partkey, ps_supplycost@2 as ps_supplycost, s_nationkey@0 as s_nationkey] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(s_suppkey@0, ps_suppkey@1)], projection=[s_nationkey@1, ps_partkey@2, ps_supplycost@4] │ CoalescePartitionsExec - │ [Stage 8] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/partsupp/1.parquet], [/testdata/tpch/plan_sf0.02/partsupp/10.parquet], [/testdata/tpch/plan_sf0.02/partsupp/11.parquet], [/testdata/tpch/plan_sf0.02/partsupp/12.parquet], [/testdata/tpch/plan_sf0.02/partsupp/13.parquet], ...]}, projection=[ps_partkey, ps_suppkey, ps_supplycost], file_type=parquet, predicate=DynamicFilter [ empty ] + │ [Stage 8] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/partsupp/1.parquet, /testdata/tpch/plan_sf0.02/partsupp/10.parquet, /testdata/tpch/plan_sf0.02/partsupp/11.parquet], [/testdata/tpch/plan_sf0.02/partsupp/12.parquet, /testdata/tpch/plan_sf0.02/partsupp/13.parquet, /testdata/tpch/plan_sf0.02/partsupp/14.parquet], [/testdata/tpch/plan_sf0.02/partsupp/15.parquet, /testdata/tpch/plan_sf0.02/partsupp/16.parquet, /testdata/tpch/plan_sf0.02/partsupp/2.parquet], [/testdata/tpch/plan_sf0.02/partsupp/3.parquet, /testdata/tpch/plan_sf0.02/partsupp/4.parquet, /testdata/tpch/plan_sf0.02/partsupp/5.parquet], [/testdata/tpch/plan_sf0.02/partsupp/6.parquet, /testdata/tpch/plan_sf0.02/partsupp/7.parquet, /testdata/tpch/plan_sf0.02/partsupp/8.parquet], ...]}, projection=[ps_partkey, ps_suppkey, ps_supplycost], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 6 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 + ┌───── Stage 6 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 │ FilterExec: r_name@1 = EUROPE, projection=[r_regionkey@0] - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/region/1.parquet], [/testdata/tpch/plan_sf0.02/region/10.parquet], [/testdata/tpch/plan_sf0.02/region/11.parquet], [/testdata/tpch/plan_sf0.02/region/12.parquet], [/testdata/tpch/plan_sf0.02/region/13.parquet], ...]}, projection=[r_regionkey, r_name], file_type=parquet, predicate=r_name@1 = EUROPE, pruning_predicate=r_name_null_count@2 != row_count@3 AND r_name_min@0 <= EUROPE AND EUROPE <= r_name_max@1, required_guarantees=[r_name in (EUROPE)] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/region/1.parquet, /testdata/tpch/plan_sf0.02/region/10.parquet, /testdata/tpch/plan_sf0.02/region/11.parquet], [/testdata/tpch/plan_sf0.02/region/12.parquet, /testdata/tpch/plan_sf0.02/region/13.parquet, /testdata/tpch/plan_sf0.02/region/14.parquet], [/testdata/tpch/plan_sf0.02/region/15.parquet, /testdata/tpch/plan_sf0.02/region/16.parquet, /testdata/tpch/plan_sf0.02/region/2.parquet], [/testdata/tpch/plan_sf0.02/region/3.parquet, /testdata/tpch/plan_sf0.02/region/4.parquet, /testdata/tpch/plan_sf0.02/region/5.parquet], [/testdata/tpch/plan_sf0.02/region/6.parquet, /testdata/tpch/plan_sf0.02/region/7.parquet, /testdata/tpch/plan_sf0.02/region/8.parquet], ...]}, projection=[r_regionkey, r_name], file_type=parquet, predicate=r_name@1 = EUROPE, pruning_predicate=r_name_null_count@2 != row_count@3 AND r_name_min@0 <= EUROPE AND EUROPE <= r_name_max@1, required_guarantees=[r_name in (EUROPE)] └────────────────────────────────────────────────── - ┌───── Stage 7 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/nation/1.parquet], [/testdata/tpch/plan_sf0.02/nation/10.parquet], [/testdata/tpch/plan_sf0.02/nation/11.parquet], [/testdata/tpch/plan_sf0.02/nation/12.parquet], [/testdata/tpch/plan_sf0.02/nation/13.parquet], ...]}, projection=[n_nationkey, n_regionkey], file_type=parquet, predicate=DynamicFilter [ empty ] + ┌───── Stage 7 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/nation/1.parquet, /testdata/tpch/plan_sf0.02/nation/10.parquet, /testdata/tpch/plan_sf0.02/nation/11.parquet], [/testdata/tpch/plan_sf0.02/nation/12.parquet, /testdata/tpch/plan_sf0.02/nation/13.parquet, /testdata/tpch/plan_sf0.02/nation/14.parquet], [/testdata/tpch/plan_sf0.02/nation/15.parquet, /testdata/tpch/plan_sf0.02/nation/16.parquet, /testdata/tpch/plan_sf0.02/nation/2.parquet], [/testdata/tpch/plan_sf0.02/nation/3.parquet, /testdata/tpch/plan_sf0.02/nation/4.parquet, /testdata/tpch/plan_sf0.02/nation/5.parquet], [/testdata/tpch/plan_sf0.02/nation/6.parquet, /testdata/tpch/plan_sf0.02/nation/7.parquet, /testdata/tpch/plan_sf0.02/nation/8.parquet], ...]}, projection=[n_nationkey, n_regionkey], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 8 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/supplier/1.parquet], [/testdata/tpch/plan_sf0.02/supplier/10.parquet], [/testdata/tpch/plan_sf0.02/supplier/11.parquet], [/testdata/tpch/plan_sf0.02/supplier/12.parquet], [/testdata/tpch/plan_sf0.02/supplier/13.parquet], ...]}, projection=[s_suppkey, s_nationkey], file_type=parquet, predicate=DynamicFilter [ empty ] + ┌───── Stage 8 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/supplier/1.parquet, /testdata/tpch/plan_sf0.02/supplier/10.parquet, /testdata/tpch/plan_sf0.02/supplier/11.parquet], [/testdata/tpch/plan_sf0.02/supplier/12.parquet, /testdata/tpch/plan_sf0.02/supplier/13.parquet, /testdata/tpch/plan_sf0.02/supplier/14.parquet], [/testdata/tpch/plan_sf0.02/supplier/15.parquet, /testdata/tpch/plan_sf0.02/supplier/16.parquet, /testdata/tpch/plan_sf0.02/supplier/2.parquet], [/testdata/tpch/plan_sf0.02/supplier/3.parquet, /testdata/tpch/plan_sf0.02/supplier/4.parquet, /testdata/tpch/plan_sf0.02/supplier/5.parquet], [/testdata/tpch/plan_sf0.02/supplier/6.parquet, /testdata/tpch/plan_sf0.02/supplier/7.parquet, /testdata/tpch/plan_sf0.02/supplier/8.parquet], ...]}, projection=[s_suppkey, s_nationkey], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── "); Ok(()) @@ -142,7 +132,7 @@ mod tests { #[tokio::test] async fn test_tpch_3() -> Result<(), Box> { let plan = test_tpch_query("q3").await?; - assert_snapshot!(plan, @" + assert_snapshot!(plan, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [revenue@1 DESC, o_orderdate@2 ASC NULLS LAST] │ [Stage 4] => NetworkCoalesceExec: output_partitions=18, input_tasks=3 @@ -154,29 +144,26 @@ mod tests { │ [Stage 3] => NetworkShuffleExec: output_partitions=6, input_tasks=4 └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p17] t1:[p0..p17] t2:[p0..p17] t3:[p0..p17] - │ RepartitionExec: partitioning=Hash([l_orderkey@0, o_orderdate@1, o_shippriority@2], 18), input_partitions=4 + │ RepartitionExec: partitioning=Hash([l_orderkey@0, o_orderdate@1, o_shippriority@2], 18), input_partitions=6 │ AggregateExec: mode=Partial, gby=[l_orderkey@2 as l_orderkey, o_orderdate@0 as o_orderdate, o_shippriority@1 as o_shippriority], aggr=[sum(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(o_orderkey@0, l_orderkey@0)], projection=[o_orderdate@1, o_shippriority@2, l_orderkey@3, l_extendedprice@4, l_discount@5] │ CoalescePartitionsExec - │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ FilterExec: l_shipdate@3 > 1995-03-15, projection=[l_orderkey@0, l_extendedprice@1, l_discount@2] - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet], [/testdata/tpch/plan_sf0.02/lineitem/10.parquet], [/testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet], [/testdata/tpch/plan_sf0.02/lineitem/13.parquet], ...]}, projection=[l_orderkey, l_extendedprice, l_discount, l_shipdate], file_type=parquet, predicate=l_shipdate@10 > 1995-03-15 AND DynamicFilter [ empty ], pruning_predicate=l_shipdate_null_count@1 != row_count@2 AND l_shipdate_max@0 > 1995-03-15, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet, /testdata/tpch/plan_sf0.02/lineitem/10.parquet, /testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet, /testdata/tpch/plan_sf0.02/lineitem/13.parquet, /testdata/tpch/plan_sf0.02/lineitem/14.parquet], [/testdata/tpch/plan_sf0.02/lineitem/15.parquet, /testdata/tpch/plan_sf0.02/lineitem/16.parquet, /testdata/tpch/plan_sf0.02/lineitem/2.parquet], [/testdata/tpch/plan_sf0.02/lineitem/3.parquet, /testdata/tpch/plan_sf0.02/lineitem/4.parquet, /testdata/tpch/plan_sf0.02/lineitem/5.parquet], [/testdata/tpch/plan_sf0.02/lineitem/6.parquet, /testdata/tpch/plan_sf0.02/lineitem/7.parquet, /testdata/tpch/plan_sf0.02/lineitem/8.parquet], ...]}, projection=[l_orderkey, l_extendedprice, l_discount, l_shipdate], file_type=parquet, predicate=l_shipdate@10 > 1995-03-15 AND DynamicFilter [ empty ], pruning_predicate=l_shipdate_null_count@1 != row_count@2 AND l_shipdate_max@0 > 1995-03-15, required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 + ┌───── Stage 2 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(c_custkey@0, o_custkey@1)], projection=[o_orderkey@1, o_orderdate@3, o_shippriority@4] │ CoalescePartitionsExec - │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ FilterExec: o_orderdate@2 < 1995-03-15 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/orders/1.parquet], [/testdata/tpch/plan_sf0.02/orders/10.parquet], [/testdata/tpch/plan_sf0.02/orders/11.parquet], [/testdata/tpch/plan_sf0.02/orders/12.parquet], [/testdata/tpch/plan_sf0.02/orders/13.parquet], ...]}, projection=[o_orderkey, o_custkey, o_orderdate, o_shippriority], file_type=parquet, predicate=o_orderdate@4 < 1995-03-15 AND DynamicFilter [ empty ], pruning_predicate=o_orderdate_null_count@1 != row_count@2 AND o_orderdate_min@0 < 1995-03-15, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/orders/1.parquet, /testdata/tpch/plan_sf0.02/orders/10.parquet, /testdata/tpch/plan_sf0.02/orders/11.parquet], [/testdata/tpch/plan_sf0.02/orders/12.parquet, /testdata/tpch/plan_sf0.02/orders/13.parquet, /testdata/tpch/plan_sf0.02/orders/14.parquet], [/testdata/tpch/plan_sf0.02/orders/15.parquet, /testdata/tpch/plan_sf0.02/orders/16.parquet, /testdata/tpch/plan_sf0.02/orders/2.parquet], [/testdata/tpch/plan_sf0.02/orders/3.parquet, /testdata/tpch/plan_sf0.02/orders/4.parquet, /testdata/tpch/plan_sf0.02/orders/5.parquet], [/testdata/tpch/plan_sf0.02/orders/6.parquet, /testdata/tpch/plan_sf0.02/orders/7.parquet, /testdata/tpch/plan_sf0.02/orders/8.parquet], ...]}, projection=[o_orderkey, o_custkey, o_orderdate, o_shippriority], file_type=parquet, predicate=o_orderdate@4 < 1995-03-15 AND DynamicFilter [ empty ], pruning_predicate=o_orderdate_null_count@1 != row_count@2 AND o_orderdate_min@0 < 1995-03-15, required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 + ┌───── Stage 1 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 │ FilterExec: c_mktsegment@1 = BUILDING, projection=[c_custkey@0] - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/customer/1.parquet], [/testdata/tpch/plan_sf0.02/customer/10.parquet], [/testdata/tpch/plan_sf0.02/customer/11.parquet], [/testdata/tpch/plan_sf0.02/customer/12.parquet], [/testdata/tpch/plan_sf0.02/customer/13.parquet], ...]}, projection=[c_custkey, c_mktsegment], file_type=parquet, predicate=c_mktsegment@6 = BUILDING, pruning_predicate=c_mktsegment_null_count@2 != row_count@3 AND c_mktsegment_min@0 <= BUILDING AND BUILDING <= c_mktsegment_max@1, required_guarantees=[c_mktsegment in (BUILDING)] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/customer/1.parquet, /testdata/tpch/plan_sf0.02/customer/10.parquet, /testdata/tpch/plan_sf0.02/customer/11.parquet], [/testdata/tpch/plan_sf0.02/customer/12.parquet, /testdata/tpch/plan_sf0.02/customer/13.parquet, /testdata/tpch/plan_sf0.02/customer/14.parquet], [/testdata/tpch/plan_sf0.02/customer/15.parquet, /testdata/tpch/plan_sf0.02/customer/16.parquet, /testdata/tpch/plan_sf0.02/customer/2.parquet], [/testdata/tpch/plan_sf0.02/customer/3.parquet, /testdata/tpch/plan_sf0.02/customer/4.parquet, /testdata/tpch/plan_sf0.02/customer/5.parquet], [/testdata/tpch/plan_sf0.02/customer/6.parquet, /testdata/tpch/plan_sf0.02/customer/7.parquet, /testdata/tpch/plan_sf0.02/customer/8.parquet], ...]}, projection=[c_custkey, c_mktsegment], file_type=parquet, predicate=c_mktsegment@6 = BUILDING, pruning_predicate=c_mktsegment_null_count@2 != row_count@3 AND c_mktsegment_min@0 <= BUILDING AND BUILDING <= c_mktsegment_max@1, required_guarantees=[c_mktsegment in (BUILDING)] └────────────────────────────────────────────────── "); Ok(()) @@ -185,7 +172,7 @@ mod tests { #[tokio::test] async fn test_tpch_4() -> Result<(), Box> { let plan = test_tpch_query("q4").await?; - assert_snapshot!(plan, @" + assert_snapshot!(plan, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [o_orderpriority@0 ASC NULLS LAST] │ SortExec: expr=[o_orderpriority@0 ASC NULLS LAST], preserve_partitioning=[true] @@ -195,14 +182,13 @@ mod tests { │ AggregateExec: mode=Partial, gby=[o_orderpriority@0 as o_orderpriority], aggr=[count(Int64(1))] │ HashJoinExec: mode=CollectLeft, join_type=LeftSemi, on=[(o_orderkey@0, l_orderkey@0)], projection=[o_orderpriority@1] │ CoalescePartitionsExec - │ [Stage 1] => NetworkCoalesceExec: output_partitions=16, input_tasks=4 + │ [Stage 1] => NetworkCoalesceExec: output_partitions=24, input_tasks=4 │ FilterExec: l_receiptdate@2 > l_commitdate@1, projection=[l_orderkey@0] │ DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet, /testdata/tpch/plan_sf0.02/lineitem/10.parquet, /testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet, /testdata/tpch/plan_sf0.02/lineitem/13.parquet, /testdata/tpch/plan_sf0.02/lineitem/14.parquet], [/testdata/tpch/plan_sf0.02/lineitem/15.parquet, /testdata/tpch/plan_sf0.02/lineitem/16.parquet, /testdata/tpch/plan_sf0.02/lineitem/2.parquet], [/testdata/tpch/plan_sf0.02/lineitem/3.parquet, /testdata/tpch/plan_sf0.02/lineitem/4.parquet, /testdata/tpch/plan_sf0.02/lineitem/5.parquet], [/testdata/tpch/plan_sf0.02/lineitem/6.parquet, /testdata/tpch/plan_sf0.02/lineitem/7.parquet, /testdata/tpch/plan_sf0.02/lineitem/8.parquet], ...]}, projection=[l_orderkey, l_commitdate, l_receiptdate], file_type=parquet, predicate=l_receiptdate@12 > l_commitdate@11 └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0..p3] t1:[p4..p7] t2:[p8..p11] t3:[p12..p15] + ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] t3:[p18..p23] │ FilterExec: o_orderdate@1 >= 1993-07-01 AND o_orderdate@1 < 1993-10-01, projection=[o_orderkey@0, o_orderpriority@2] - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/orders/1.parquet], [/testdata/tpch/plan_sf0.02/orders/10.parquet], [/testdata/tpch/plan_sf0.02/orders/11.parquet], [/testdata/tpch/plan_sf0.02/orders/12.parquet], [/testdata/tpch/plan_sf0.02/orders/13.parquet], ...]}, projection=[o_orderkey, o_orderdate, o_orderpriority], file_type=parquet, predicate=o_orderdate@4 >= 1993-07-01 AND o_orderdate@4 < 1993-10-01, pruning_predicate=o_orderdate_null_count@1 != row_count@2 AND o_orderdate_max@0 >= 1993-07-01 AND o_orderdate_null_count@1 != row_count@2 AND o_orderdate_min@3 < 1993-10-01, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/orders/1.parquet, /testdata/tpch/plan_sf0.02/orders/10.parquet, /testdata/tpch/plan_sf0.02/orders/11.parquet], [/testdata/tpch/plan_sf0.02/orders/12.parquet, /testdata/tpch/plan_sf0.02/orders/13.parquet, /testdata/tpch/plan_sf0.02/orders/14.parquet], [/testdata/tpch/plan_sf0.02/orders/15.parquet, /testdata/tpch/plan_sf0.02/orders/16.parquet, /testdata/tpch/plan_sf0.02/orders/2.parquet], [/testdata/tpch/plan_sf0.02/orders/3.parquet, /testdata/tpch/plan_sf0.02/orders/4.parquet, /testdata/tpch/plan_sf0.02/orders/5.parquet], [/testdata/tpch/plan_sf0.02/orders/6.parquet, /testdata/tpch/plan_sf0.02/orders/7.parquet, /testdata/tpch/plan_sf0.02/orders/8.parquet], ...]}, projection=[o_orderkey, o_orderdate, o_orderpriority], file_type=parquet, predicate=o_orderdate@4 >= 1993-07-01 AND o_orderdate@4 < 1993-10-01, pruning_predicate=o_orderdate_null_count@1 != row_count@2 AND o_orderdate_max@0 >= 1993-07-01 AND o_orderdate_null_count@1 != row_count@2 AND o_orderdate_min@3 < 1993-10-01, required_guarantees=[] └────────────────────────────────────────────────── "); Ok(()) @@ -211,7 +197,7 @@ mod tests { #[tokio::test] async fn test_tpch_5() -> Result<(), Box> { let plan = test_tpch_query("q5").await?; - assert_snapshot!(plan, @" + assert_snapshot!(plan, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [revenue@1 DESC] │ [Stage 7] => NetworkCoalesceExec: output_partitions=18, input_tasks=3 @@ -223,54 +209,48 @@ mod tests { │ [Stage 6] => NetworkShuffleExec: output_partitions=6, input_tasks=4 └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p17] t1:[p0..p17] t2:[p0..p17] t3:[p0..p17] - │ RepartitionExec: partitioning=Hash([n_name@0], 18), input_partitions=4 + │ RepartitionExec: partitioning=Hash([n_name@0], 18), input_partitions=6 │ AggregateExec: mode=Partial, gby=[n_name@2 as n_name], aggr=[sum(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(r_regionkey@0, n_regionkey@3)], projection=[l_extendedprice@1, l_discount@2, n_name@3] │ CoalescePartitionsExec - │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ ProjectionExec: expr=[l_extendedprice@2 as l_extendedprice, l_discount@3 as l_discount, n_name@0 as n_name, n_regionkey@1 as n_regionkey] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(n_nationkey@0, s_nationkey@2)], projection=[n_name@1, n_regionkey@2, l_extendedprice@3, l_discount@4] │ CoalescePartitionsExec - │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ ProjectionExec: expr=[l_extendedprice@1 as l_extendedprice, l_discount@2 as l_discount, s_nationkey@0 as s_nationkey] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(s_suppkey@0, l_suppkey@1), (s_nationkey@1, c_nationkey@0)], projection=[s_nationkey@1, l_extendedprice@4, l_discount@5] │ CoalescePartitionsExec - │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(o_orderkey@1, l_orderkey@0)], projection=[c_nationkey@0, l_suppkey@3, l_extendedprice@4, l_discount@5] │ CoalescePartitionsExec - │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet], [/testdata/tpch/plan_sf0.02/lineitem/10.parquet], [/testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet], [/testdata/tpch/plan_sf0.02/lineitem/13.parquet], ...]}, projection=[l_orderkey, l_suppkey, l_extendedprice, l_discount], file_type=parquet, predicate=DynamicFilter [ empty ] + │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet, /testdata/tpch/plan_sf0.02/lineitem/10.parquet, /testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet, /testdata/tpch/plan_sf0.02/lineitem/13.parquet, /testdata/tpch/plan_sf0.02/lineitem/14.parquet], [/testdata/tpch/plan_sf0.02/lineitem/15.parquet, /testdata/tpch/plan_sf0.02/lineitem/16.parquet, /testdata/tpch/plan_sf0.02/lineitem/2.parquet], [/testdata/tpch/plan_sf0.02/lineitem/3.parquet, /testdata/tpch/plan_sf0.02/lineitem/4.parquet, /testdata/tpch/plan_sf0.02/lineitem/5.parquet], [/testdata/tpch/plan_sf0.02/lineitem/6.parquet, /testdata/tpch/plan_sf0.02/lineitem/7.parquet, /testdata/tpch/plan_sf0.02/lineitem/8.parquet], ...]}, projection=[l_orderkey, l_suppkey, l_extendedprice, l_discount], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 + ┌───── Stage 1 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 │ FilterExec: r_name@1 = ASIA, projection=[r_regionkey@0] - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/region/1.parquet], [/testdata/tpch/plan_sf0.02/region/10.parquet], [/testdata/tpch/plan_sf0.02/region/11.parquet], [/testdata/tpch/plan_sf0.02/region/12.parquet], [/testdata/tpch/plan_sf0.02/region/13.parquet], ...]}, projection=[r_regionkey, r_name], file_type=parquet, predicate=r_name@1 = ASIA, pruning_predicate=r_name_null_count@2 != row_count@3 AND r_name_min@0 <= ASIA AND ASIA <= r_name_max@1, required_guarantees=[r_name in (ASIA)] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/region/1.parquet, /testdata/tpch/plan_sf0.02/region/10.parquet, /testdata/tpch/plan_sf0.02/region/11.parquet], [/testdata/tpch/plan_sf0.02/region/12.parquet, /testdata/tpch/plan_sf0.02/region/13.parquet, /testdata/tpch/plan_sf0.02/region/14.parquet], [/testdata/tpch/plan_sf0.02/region/15.parquet, /testdata/tpch/plan_sf0.02/region/16.parquet, /testdata/tpch/plan_sf0.02/region/2.parquet], [/testdata/tpch/plan_sf0.02/region/3.parquet, /testdata/tpch/plan_sf0.02/region/4.parquet, /testdata/tpch/plan_sf0.02/region/5.parquet], [/testdata/tpch/plan_sf0.02/region/6.parquet, /testdata/tpch/plan_sf0.02/region/7.parquet, /testdata/tpch/plan_sf0.02/region/8.parquet], ...]}, projection=[r_regionkey, r_name], file_type=parquet, predicate=r_name@1 = ASIA, pruning_predicate=r_name_null_count@2 != row_count@3 AND r_name_min@0 <= ASIA AND ASIA <= r_name_max@1, required_guarantees=[r_name in (ASIA)] └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/nation/1.parquet], [/testdata/tpch/plan_sf0.02/nation/10.parquet], [/testdata/tpch/plan_sf0.02/nation/11.parquet], [/testdata/tpch/plan_sf0.02/nation/12.parquet], [/testdata/tpch/plan_sf0.02/nation/13.parquet], ...]}, projection=[n_nationkey, n_name, n_regionkey], file_type=parquet, predicate=DynamicFilter [ empty ] + ┌───── Stage 2 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/nation/1.parquet, /testdata/tpch/plan_sf0.02/nation/10.parquet, /testdata/tpch/plan_sf0.02/nation/11.parquet], [/testdata/tpch/plan_sf0.02/nation/12.parquet, /testdata/tpch/plan_sf0.02/nation/13.parquet, /testdata/tpch/plan_sf0.02/nation/14.parquet], [/testdata/tpch/plan_sf0.02/nation/15.parquet, /testdata/tpch/plan_sf0.02/nation/16.parquet, /testdata/tpch/plan_sf0.02/nation/2.parquet], [/testdata/tpch/plan_sf0.02/nation/3.parquet, /testdata/tpch/plan_sf0.02/nation/4.parquet, /testdata/tpch/plan_sf0.02/nation/5.parquet], [/testdata/tpch/plan_sf0.02/nation/6.parquet, /testdata/tpch/plan_sf0.02/nation/7.parquet, /testdata/tpch/plan_sf0.02/nation/8.parquet], ...]}, projection=[n_nationkey, n_name, n_regionkey], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 3 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/supplier/1.parquet], [/testdata/tpch/plan_sf0.02/supplier/10.parquet], [/testdata/tpch/plan_sf0.02/supplier/11.parquet], [/testdata/tpch/plan_sf0.02/supplier/12.parquet], [/testdata/tpch/plan_sf0.02/supplier/13.parquet], ...]}, projection=[s_suppkey, s_nationkey], file_type=parquet, predicate=DynamicFilter [ empty ] + ┌───── Stage 3 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/supplier/1.parquet, /testdata/tpch/plan_sf0.02/supplier/10.parquet, /testdata/tpch/plan_sf0.02/supplier/11.parquet], [/testdata/tpch/plan_sf0.02/supplier/12.parquet, /testdata/tpch/plan_sf0.02/supplier/13.parquet, /testdata/tpch/plan_sf0.02/supplier/14.parquet], [/testdata/tpch/plan_sf0.02/supplier/15.parquet, /testdata/tpch/plan_sf0.02/supplier/16.parquet, /testdata/tpch/plan_sf0.02/supplier/2.parquet], [/testdata/tpch/plan_sf0.02/supplier/3.parquet, /testdata/tpch/plan_sf0.02/supplier/4.parquet, /testdata/tpch/plan_sf0.02/supplier/5.parquet], [/testdata/tpch/plan_sf0.02/supplier/6.parquet, /testdata/tpch/plan_sf0.02/supplier/7.parquet, /testdata/tpch/plan_sf0.02/supplier/8.parquet], ...]}, projection=[s_suppkey, s_nationkey], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 5 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 + ┌───── Stage 5 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(c_custkey@0, o_custkey@1)], projection=[c_nationkey@1, o_orderkey@2] │ CoalescePartitionsExec - │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ FilterExec: o_orderdate@2 >= 1994-01-01 AND o_orderdate@2 < 1995-01-01, projection=[o_orderkey@0, o_custkey@1] - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/orders/1.parquet], [/testdata/tpch/plan_sf0.02/orders/10.parquet], [/testdata/tpch/plan_sf0.02/orders/11.parquet], [/testdata/tpch/plan_sf0.02/orders/12.parquet], [/testdata/tpch/plan_sf0.02/orders/13.parquet], ...]}, projection=[o_orderkey, o_custkey, o_orderdate], file_type=parquet, predicate=o_orderdate@4 >= 1994-01-01 AND o_orderdate@4 < 1995-01-01 AND DynamicFilter [ empty ], pruning_predicate=o_orderdate_null_count@1 != row_count@2 AND o_orderdate_max@0 >= 1994-01-01 AND o_orderdate_null_count@1 != row_count@2 AND o_orderdate_min@3 < 1995-01-01, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/orders/1.parquet, /testdata/tpch/plan_sf0.02/orders/10.parquet, /testdata/tpch/plan_sf0.02/orders/11.parquet], [/testdata/tpch/plan_sf0.02/orders/12.parquet, /testdata/tpch/plan_sf0.02/orders/13.parquet, /testdata/tpch/plan_sf0.02/orders/14.parquet], [/testdata/tpch/plan_sf0.02/orders/15.parquet, /testdata/tpch/plan_sf0.02/orders/16.parquet, /testdata/tpch/plan_sf0.02/orders/2.parquet], [/testdata/tpch/plan_sf0.02/orders/3.parquet, /testdata/tpch/plan_sf0.02/orders/4.parquet, /testdata/tpch/plan_sf0.02/orders/5.parquet], [/testdata/tpch/plan_sf0.02/orders/6.parquet, /testdata/tpch/plan_sf0.02/orders/7.parquet, /testdata/tpch/plan_sf0.02/orders/8.parquet], ...]}, projection=[o_orderkey, o_custkey, o_orderdate], file_type=parquet, predicate=o_orderdate@4 >= 1994-01-01 AND o_orderdate@4 < 1995-01-01 AND DynamicFilter [ empty ], pruning_predicate=o_orderdate_null_count@1 != row_count@2 AND o_orderdate_max@0 >= 1994-01-01 AND o_orderdate_null_count@1 != row_count@2 AND o_orderdate_min@3 < 1995-01-01, required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 4 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/customer/1.parquet], [/testdata/tpch/plan_sf0.02/customer/10.parquet], [/testdata/tpch/plan_sf0.02/customer/11.parquet], [/testdata/tpch/plan_sf0.02/customer/12.parquet], [/testdata/tpch/plan_sf0.02/customer/13.parquet], ...]}, projection=[c_custkey, c_nationkey], file_type=parquet + ┌───── Stage 4 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/customer/1.parquet, /testdata/tpch/plan_sf0.02/customer/10.parquet, /testdata/tpch/plan_sf0.02/customer/11.parquet], [/testdata/tpch/plan_sf0.02/customer/12.parquet, /testdata/tpch/plan_sf0.02/customer/13.parquet, /testdata/tpch/plan_sf0.02/customer/14.parquet], [/testdata/tpch/plan_sf0.02/customer/15.parquet, /testdata/tpch/plan_sf0.02/customer/16.parquet, /testdata/tpch/plan_sf0.02/customer/2.parquet], [/testdata/tpch/plan_sf0.02/customer/3.parquet, /testdata/tpch/plan_sf0.02/customer/4.parquet, /testdata/tpch/plan_sf0.02/customer/5.parquet], [/testdata/tpch/plan_sf0.02/customer/6.parquet, /testdata/tpch/plan_sf0.02/customer/7.parquet, /testdata/tpch/plan_sf0.02/customer/8.parquet], ...]}, projection=[c_custkey, c_nationkey], file_type=parquet └────────────────────────────────────────────────── "); Ok(()) @@ -279,18 +259,17 @@ mod tests { #[tokio::test] async fn test_tpch_6() -> Result<(), Box> { let plan = test_tpch_query("q6").await?; - assert_snapshot!(plan, @" + assert_snapshot!(plan, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ ProjectionExec: expr=[sum(lineitem.l_extendedprice * lineitem.l_discount)@0 as revenue] │ AggregateExec: mode=Final, gby=[], aggr=[sum(lineitem.l_extendedprice * lineitem.l_discount)] │ CoalescePartitionsExec - │ [Stage 1] => NetworkCoalesceExec: output_partitions=16, input_tasks=4 + │ [Stage 1] => NetworkCoalesceExec: output_partitions=24, input_tasks=4 └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0..p3] t1:[p4..p7] t2:[p8..p11] t3:[p12..p15] + ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] t3:[p18..p23] │ AggregateExec: mode=Partial, gby=[], aggr=[sum(lineitem.l_extendedprice * lineitem.l_discount)] │ FilterExec: l_shipdate@3 >= 1994-01-01 AND l_shipdate@3 < 1995-01-01 AND l_discount@2 >= Some(5),15,2 AND l_discount@2 <= Some(7),15,2 AND l_quantity@0 < Some(2400),15,2, projection=[l_extendedprice@1, l_discount@2] - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet], [/testdata/tpch/plan_sf0.02/lineitem/10.parquet], [/testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet], [/testdata/tpch/plan_sf0.02/lineitem/13.parquet], ...]}, projection=[l_quantity, l_extendedprice, l_discount, l_shipdate], file_type=parquet, predicate=l_shipdate@10 >= 1994-01-01 AND l_shipdate@10 < 1995-01-01 AND l_discount@6 >= Some(5),15,2 AND l_discount@6 <= Some(7),15,2 AND l_quantity@4 < Some(2400),15,2, pruning_predicate=l_shipdate_null_count@1 != row_count@2 AND l_shipdate_max@0 >= 1994-01-01 AND l_shipdate_null_count@1 != row_count@2 AND l_shipdate_min@3 < 1995-01-01 AND l_discount_null_count@5 != row_count@2 AND l_discount_max@4 >= Some(5),15,2 AND l_discount_null_count@5 != row_count@2 AND l_discount_min@6 <= Some(7),15,2 AND l_quantity_null_count@8 != row_count@2 AND l_quantity_min@7 < Some(2400),15,2, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet, /testdata/tpch/plan_sf0.02/lineitem/10.parquet, /testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet, /testdata/tpch/plan_sf0.02/lineitem/13.parquet, /testdata/tpch/plan_sf0.02/lineitem/14.parquet], [/testdata/tpch/plan_sf0.02/lineitem/15.parquet, /testdata/tpch/plan_sf0.02/lineitem/16.parquet, /testdata/tpch/plan_sf0.02/lineitem/2.parquet], [/testdata/tpch/plan_sf0.02/lineitem/3.parquet, /testdata/tpch/plan_sf0.02/lineitem/4.parquet, /testdata/tpch/plan_sf0.02/lineitem/5.parquet], [/testdata/tpch/plan_sf0.02/lineitem/6.parquet, /testdata/tpch/plan_sf0.02/lineitem/7.parquet, /testdata/tpch/plan_sf0.02/lineitem/8.parquet], ...]}, projection=[l_quantity, l_extendedprice, l_discount, l_shipdate], file_type=parquet, predicate=l_shipdate@10 >= 1994-01-01 AND l_shipdate@10 < 1995-01-01 AND l_discount@6 >= Some(5),15,2 AND l_discount@6 <= Some(7),15,2 AND l_quantity@4 < Some(2400),15,2, pruning_predicate=l_shipdate_null_count@1 != row_count@2 AND l_shipdate_max@0 >= 1994-01-01 AND l_shipdate_null_count@1 != row_count@2 AND l_shipdate_min@3 < 1995-01-01 AND l_discount_null_count@5 != row_count@2 AND l_discount_max@4 >= Some(5),15,2 AND l_discount_null_count@5 != row_count@2 AND l_discount_min@6 <= Some(7),15,2 AND l_quantity_null_count@8 != row_count@2 AND l_quantity_min@7 < Some(2400),15,2, required_guarantees=[] └────────────────────────────────────────────────── "); Ok(()) @@ -299,7 +278,7 @@ mod tests { #[tokio::test] async fn test_tpch_7() -> Result<(), Box> { let plan = test_tpch_query("q7").await?; - assert_snapshot!(plan, @" + assert_snapshot!(plan, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [supp_nation@0 ASC NULLS LAST, cust_nation@1 ASC NULLS LAST, l_year@2 ASC NULLS LAST] │ [Stage 7] => NetworkCoalesceExec: output_partitions=18, input_tasks=3 @@ -311,57 +290,51 @@ mod tests { │ [Stage 6] => NetworkShuffleExec: output_partitions=6, input_tasks=4 └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p17] t1:[p0..p17] t2:[p0..p17] t3:[p0..p17] - │ RepartitionExec: partitioning=Hash([supp_nation@0, cust_nation@1, l_year@2], 18), input_partitions=4 + │ RepartitionExec: partitioning=Hash([supp_nation@0, cust_nation@1, l_year@2], 18), input_partitions=6 │ AggregateExec: mode=Partial, gby=[supp_nation@0 as supp_nation, cust_nation@1 as cust_nation, l_year@2 as l_year], aggr=[sum(shipping.volume)] │ ProjectionExec: expr=[n_name@4 as supp_nation, n_name@0 as cust_nation, date_part(YEAR, l_shipdate@3) as l_year, l_extendedprice@1 * (Some(1),20,0 - l_discount@2) as volume] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(n_nationkey@0, c_nationkey@3)], filter=n_name@0 = FRANCE AND n_name@1 = GERMANY OR n_name@0 = GERMANY AND n_name@1 = FRANCE, projection=[n_name@1, l_extendedprice@2, l_discount@3, l_shipdate@4, n_name@6] │ CoalescePartitionsExec - │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ ProjectionExec: expr=[l_extendedprice@1 as l_extendedprice, l_discount@2 as l_discount, l_shipdate@3 as l_shipdate, c_nationkey@4 as c_nationkey, n_name@0 as n_name] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(n_nationkey@0, s_nationkey@0)], projection=[n_name@1, l_extendedprice@3, l_discount@4, l_shipdate@5, c_nationkey@6] │ CoalescePartitionsExec - │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ ProjectionExec: expr=[s_nationkey@1 as s_nationkey, l_extendedprice@2 as l_extendedprice, l_discount@3 as l_discount, l_shipdate@4 as l_shipdate, c_nationkey@0 as c_nationkey] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(c_custkey@0, o_custkey@4)], projection=[c_nationkey@1, s_nationkey@2, l_extendedprice@3, l_discount@4, l_shipdate@5] │ CoalescePartitionsExec - │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ ProjectionExec: expr=[s_nationkey@1 as s_nationkey, l_extendedprice@2 as l_extendedprice, l_discount@3 as l_discount, l_shipdate@4 as l_shipdate, o_custkey@0 as o_custkey] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(o_orderkey@0, l_orderkey@1)], projection=[o_custkey@1, s_nationkey@2, l_extendedprice@4, l_discount@5, l_shipdate@6] │ CoalescePartitionsExec - │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(s_suppkey@0, l_suppkey@1)], projection=[s_nationkey@1, l_orderkey@2, l_extendedprice@4, l_discount@5, l_shipdate@6] │ CoalescePartitionsExec - │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ FilterExec: l_shipdate@4 >= 1995-01-01 AND l_shipdate@4 <= 1996-12-31 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet], [/testdata/tpch/plan_sf0.02/lineitem/10.parquet], [/testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet], [/testdata/tpch/plan_sf0.02/lineitem/13.parquet], ...]}, projection=[l_orderkey, l_suppkey, l_extendedprice, l_discount, l_shipdate], file_type=parquet, predicate=l_shipdate@10 >= 1995-01-01 AND l_shipdate@10 <= 1996-12-31 AND DynamicFilter [ empty ] AND DynamicFilter [ empty ], pruning_predicate=l_shipdate_null_count@1 != row_count@2 AND l_shipdate_max@0 >= 1995-01-01 AND l_shipdate_null_count@1 != row_count@2 AND l_shipdate_min@3 <= 1996-12-31, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet, /testdata/tpch/plan_sf0.02/lineitem/10.parquet, /testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet, /testdata/tpch/plan_sf0.02/lineitem/13.parquet, /testdata/tpch/plan_sf0.02/lineitem/14.parquet], [/testdata/tpch/plan_sf0.02/lineitem/15.parquet, /testdata/tpch/plan_sf0.02/lineitem/16.parquet, /testdata/tpch/plan_sf0.02/lineitem/2.parquet], [/testdata/tpch/plan_sf0.02/lineitem/3.parquet, /testdata/tpch/plan_sf0.02/lineitem/4.parquet, /testdata/tpch/plan_sf0.02/lineitem/5.parquet], [/testdata/tpch/plan_sf0.02/lineitem/6.parquet, /testdata/tpch/plan_sf0.02/lineitem/7.parquet, /testdata/tpch/plan_sf0.02/lineitem/8.parquet], ...]}, projection=[l_orderkey, l_suppkey, l_extendedprice, l_discount, l_shipdate], file_type=parquet, predicate=l_shipdate@10 >= 1995-01-01 AND l_shipdate@10 <= 1996-12-31 AND DynamicFilter [ empty ] AND DynamicFilter [ empty ], pruning_predicate=l_shipdate_null_count@1 != row_count@2 AND l_shipdate_max@0 >= 1995-01-01 AND l_shipdate_null_count@1 != row_count@2 AND l_shipdate_min@3 <= 1996-12-31, required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 + ┌───── Stage 1 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 │ FilterExec: n_name@1 = GERMANY OR n_name@1 = FRANCE - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/nation/1.parquet], [/testdata/tpch/plan_sf0.02/nation/10.parquet], [/testdata/tpch/plan_sf0.02/nation/11.parquet], [/testdata/tpch/plan_sf0.02/nation/12.parquet], [/testdata/tpch/plan_sf0.02/nation/13.parquet], ...]}, projection=[n_nationkey, n_name], file_type=parquet, predicate=n_name@1 = GERMANY OR n_name@1 = FRANCE, pruning_predicate=n_name_null_count@2 != row_count@3 AND n_name_min@0 <= GERMANY AND GERMANY <= n_name_max@1 OR n_name_null_count@2 != row_count@3 AND n_name_min@0 <= FRANCE AND FRANCE <= n_name_max@1, required_guarantees=[n_name in (FRANCE, GERMANY)] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/nation/1.parquet, /testdata/tpch/plan_sf0.02/nation/10.parquet, /testdata/tpch/plan_sf0.02/nation/11.parquet], [/testdata/tpch/plan_sf0.02/nation/12.parquet, /testdata/tpch/plan_sf0.02/nation/13.parquet, /testdata/tpch/plan_sf0.02/nation/14.parquet], [/testdata/tpch/plan_sf0.02/nation/15.parquet, /testdata/tpch/plan_sf0.02/nation/16.parquet, /testdata/tpch/plan_sf0.02/nation/2.parquet], [/testdata/tpch/plan_sf0.02/nation/3.parquet, /testdata/tpch/plan_sf0.02/nation/4.parquet, /testdata/tpch/plan_sf0.02/nation/5.parquet], [/testdata/tpch/plan_sf0.02/nation/6.parquet, /testdata/tpch/plan_sf0.02/nation/7.parquet, /testdata/tpch/plan_sf0.02/nation/8.parquet], ...]}, projection=[n_nationkey, n_name], file_type=parquet, predicate=n_name@1 = GERMANY OR n_name@1 = FRANCE, pruning_predicate=n_name_null_count@2 != row_count@3 AND n_name_min@0 <= GERMANY AND GERMANY <= n_name_max@1 OR n_name_null_count@2 != row_count@3 AND n_name_min@0 <= FRANCE AND FRANCE <= n_name_max@1, required_guarantees=[n_name in (FRANCE, GERMANY)] └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 + ┌───── Stage 2 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 │ FilterExec: n_name@1 = FRANCE OR n_name@1 = GERMANY - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/nation/1.parquet], [/testdata/tpch/plan_sf0.02/nation/10.parquet], [/testdata/tpch/plan_sf0.02/nation/11.parquet], [/testdata/tpch/plan_sf0.02/nation/12.parquet], [/testdata/tpch/plan_sf0.02/nation/13.parquet], ...]}, projection=[n_nationkey, n_name], file_type=parquet, predicate=n_name@1 = FRANCE OR n_name@1 = GERMANY, pruning_predicate=n_name_null_count@2 != row_count@3 AND n_name_min@0 <= FRANCE AND FRANCE <= n_name_max@1 OR n_name_null_count@2 != row_count@3 AND n_name_min@0 <= GERMANY AND GERMANY <= n_name_max@1, required_guarantees=[n_name in (FRANCE, GERMANY)] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/nation/1.parquet, /testdata/tpch/plan_sf0.02/nation/10.parquet, /testdata/tpch/plan_sf0.02/nation/11.parquet], [/testdata/tpch/plan_sf0.02/nation/12.parquet, /testdata/tpch/plan_sf0.02/nation/13.parquet, /testdata/tpch/plan_sf0.02/nation/14.parquet], [/testdata/tpch/plan_sf0.02/nation/15.parquet, /testdata/tpch/plan_sf0.02/nation/16.parquet, /testdata/tpch/plan_sf0.02/nation/2.parquet], [/testdata/tpch/plan_sf0.02/nation/3.parquet, /testdata/tpch/plan_sf0.02/nation/4.parquet, /testdata/tpch/plan_sf0.02/nation/5.parquet], [/testdata/tpch/plan_sf0.02/nation/6.parquet, /testdata/tpch/plan_sf0.02/nation/7.parquet, /testdata/tpch/plan_sf0.02/nation/8.parquet], ...]}, projection=[n_nationkey, n_name], file_type=parquet, predicate=n_name@1 = FRANCE OR n_name@1 = GERMANY, pruning_predicate=n_name_null_count@2 != row_count@3 AND n_name_min@0 <= FRANCE AND FRANCE <= n_name_max@1 OR n_name_null_count@2 != row_count@3 AND n_name_min@0 <= GERMANY AND GERMANY <= n_name_max@1, required_guarantees=[n_name in (FRANCE, GERMANY)] └────────────────────────────────────────────────── - ┌───── Stage 3 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/customer/1.parquet], [/testdata/tpch/plan_sf0.02/customer/10.parquet], [/testdata/tpch/plan_sf0.02/customer/11.parquet], [/testdata/tpch/plan_sf0.02/customer/12.parquet], [/testdata/tpch/plan_sf0.02/customer/13.parquet], ...]}, projection=[c_custkey, c_nationkey], file_type=parquet, predicate=DynamicFilter [ empty ] + ┌───── Stage 3 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/customer/1.parquet, /testdata/tpch/plan_sf0.02/customer/10.parquet, /testdata/tpch/plan_sf0.02/customer/11.parquet], [/testdata/tpch/plan_sf0.02/customer/12.parquet, /testdata/tpch/plan_sf0.02/customer/13.parquet, /testdata/tpch/plan_sf0.02/customer/14.parquet], [/testdata/tpch/plan_sf0.02/customer/15.parquet, /testdata/tpch/plan_sf0.02/customer/16.parquet, /testdata/tpch/plan_sf0.02/customer/2.parquet], [/testdata/tpch/plan_sf0.02/customer/3.parquet, /testdata/tpch/plan_sf0.02/customer/4.parquet, /testdata/tpch/plan_sf0.02/customer/5.parquet], [/testdata/tpch/plan_sf0.02/customer/6.parquet, /testdata/tpch/plan_sf0.02/customer/7.parquet, /testdata/tpch/plan_sf0.02/customer/8.parquet], ...]}, projection=[c_custkey, c_nationkey], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 4 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/orders/1.parquet], [/testdata/tpch/plan_sf0.02/orders/10.parquet], [/testdata/tpch/plan_sf0.02/orders/11.parquet], [/testdata/tpch/plan_sf0.02/orders/12.parquet], [/testdata/tpch/plan_sf0.02/orders/13.parquet], ...]}, projection=[o_orderkey, o_custkey], file_type=parquet, predicate=DynamicFilter [ empty ] + ┌───── Stage 4 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/orders/1.parquet, /testdata/tpch/plan_sf0.02/orders/10.parquet, /testdata/tpch/plan_sf0.02/orders/11.parquet], [/testdata/tpch/plan_sf0.02/orders/12.parquet, /testdata/tpch/plan_sf0.02/orders/13.parquet, /testdata/tpch/plan_sf0.02/orders/14.parquet], [/testdata/tpch/plan_sf0.02/orders/15.parquet, /testdata/tpch/plan_sf0.02/orders/16.parquet, /testdata/tpch/plan_sf0.02/orders/2.parquet], [/testdata/tpch/plan_sf0.02/orders/3.parquet, /testdata/tpch/plan_sf0.02/orders/4.parquet, /testdata/tpch/plan_sf0.02/orders/5.parquet], [/testdata/tpch/plan_sf0.02/orders/6.parquet, /testdata/tpch/plan_sf0.02/orders/7.parquet, /testdata/tpch/plan_sf0.02/orders/8.parquet], ...]}, projection=[o_orderkey, o_custkey], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 5 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/supplier/1.parquet], [/testdata/tpch/plan_sf0.02/supplier/10.parquet], [/testdata/tpch/plan_sf0.02/supplier/11.parquet], [/testdata/tpch/plan_sf0.02/supplier/12.parquet], [/testdata/tpch/plan_sf0.02/supplier/13.parquet], ...]}, projection=[s_suppkey, s_nationkey], file_type=parquet, predicate=DynamicFilter [ empty ] + ┌───── Stage 5 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/supplier/1.parquet, /testdata/tpch/plan_sf0.02/supplier/10.parquet, /testdata/tpch/plan_sf0.02/supplier/11.parquet], [/testdata/tpch/plan_sf0.02/supplier/12.parquet, /testdata/tpch/plan_sf0.02/supplier/13.parquet, /testdata/tpch/plan_sf0.02/supplier/14.parquet], [/testdata/tpch/plan_sf0.02/supplier/15.parquet, /testdata/tpch/plan_sf0.02/supplier/16.parquet, /testdata/tpch/plan_sf0.02/supplier/2.parquet], [/testdata/tpch/plan_sf0.02/supplier/3.parquet, /testdata/tpch/plan_sf0.02/supplier/4.parquet, /testdata/tpch/plan_sf0.02/supplier/5.parquet], [/testdata/tpch/plan_sf0.02/supplier/6.parquet, /testdata/tpch/plan_sf0.02/supplier/7.parquet, /testdata/tpch/plan_sf0.02/supplier/8.parquet], ...]}, projection=[s_suppkey, s_nationkey], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── "); Ok(()) @@ -382,75 +355,67 @@ mod tests { │ [Stage 8] => NetworkShuffleExec: output_partitions=6, input_tasks=4 └────────────────────────────────────────────────── ┌───── Stage 8 ── Tasks: t0:[p0..p17] t1:[p0..p17] t2:[p0..p17] t3:[p0..p17] - │ RepartitionExec: partitioning=Hash([o_year@0], 18), input_partitions=4 + │ RepartitionExec: partitioning=Hash([o_year@0], 18), input_partitions=6 │ AggregateExec: mode=Partial, gby=[o_year@0 as o_year], aggr=[sum(CASE WHEN all_nations.nation = Utf8("BRAZIL") THEN all_nations.volume ELSE Int64(0) END), sum(all_nations.volume)] │ ProjectionExec: expr=[date_part(YEAR, o_orderdate@2) as o_year, l_extendedprice@0 * (Some(1),20,0 - l_discount@1) as volume, n_name@3 as nation] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(r_regionkey@0, n_regionkey@3)], projection=[l_extendedprice@1, l_discount@2, o_orderdate@3, n_name@5] │ CoalescePartitionsExec - │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ ProjectionExec: expr=[l_extendedprice@1 as l_extendedprice, l_discount@2 as l_discount, o_orderdate@3 as o_orderdate, n_regionkey@4 as n_regionkey, n_name@0 as n_name] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(n_nationkey@0, s_nationkey@2)], projection=[n_name@1, l_extendedprice@2, l_discount@3, o_orderdate@5, n_regionkey@6] │ CoalescePartitionsExec - │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ ProjectionExec: expr=[l_extendedprice@1 as l_extendedprice, l_discount@2 as l_discount, s_nationkey@3 as s_nationkey, o_orderdate@4 as o_orderdate, n_regionkey@0 as n_regionkey] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(n_nationkey@0, c_nationkey@4)], projection=[n_regionkey@1, l_extendedprice@2, l_discount@3, s_nationkey@4, o_orderdate@5] │ CoalescePartitionsExec - │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ ProjectionExec: expr=[l_extendedprice@1 as l_extendedprice, l_discount@2 as l_discount, s_nationkey@3 as s_nationkey, o_orderdate@4 as o_orderdate, c_nationkey@0 as c_nationkey] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(c_custkey@0, o_custkey@3)], projection=[c_nationkey@1, l_extendedprice@2, l_discount@3, s_nationkey@4, o_orderdate@6] │ CoalescePartitionsExec - │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ ProjectionExec: expr=[l_extendedprice@2 as l_extendedprice, l_discount@3 as l_discount, s_nationkey@4 as s_nationkey, o_custkey@0 as o_custkey, o_orderdate@1 as o_orderdate] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(o_orderkey@0, l_orderkey@0)], projection=[o_custkey@1, o_orderdate@2, l_extendedprice@4, l_discount@5, s_nationkey@6] │ CoalescePartitionsExec - │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ ProjectionExec: expr=[l_orderkey@1 as l_orderkey, l_extendedprice@2 as l_extendedprice, l_discount@3 as l_discount, s_nationkey@0 as s_nationkey] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(s_suppkey@0, l_suppkey@1)], projection=[s_nationkey@1, l_orderkey@2, l_extendedprice@4, l_discount@5] │ CoalescePartitionsExec - │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 6] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(p_partkey@0, l_partkey@1)], projection=[l_orderkey@1, l_suppkey@3, l_extendedprice@4, l_discount@5] │ CoalescePartitionsExec - │ [Stage 7] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet], [/testdata/tpch/plan_sf0.02/lineitem/10.parquet], [/testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet], [/testdata/tpch/plan_sf0.02/lineitem/13.parquet], ...]}, projection=[l_orderkey, l_partkey, l_suppkey, l_extendedprice, l_discount], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ [Stage 7] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet, /testdata/tpch/plan_sf0.02/lineitem/10.parquet, /testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet, /testdata/tpch/plan_sf0.02/lineitem/13.parquet, /testdata/tpch/plan_sf0.02/lineitem/14.parquet], [/testdata/tpch/plan_sf0.02/lineitem/15.parquet, /testdata/tpch/plan_sf0.02/lineitem/16.parquet, /testdata/tpch/plan_sf0.02/lineitem/2.parquet], [/testdata/tpch/plan_sf0.02/lineitem/3.parquet, /testdata/tpch/plan_sf0.02/lineitem/4.parquet, /testdata/tpch/plan_sf0.02/lineitem/5.parquet], [/testdata/tpch/plan_sf0.02/lineitem/6.parquet, /testdata/tpch/plan_sf0.02/lineitem/7.parquet, /testdata/tpch/plan_sf0.02/lineitem/8.parquet], ...]}, projection=[l_orderkey, l_partkey, l_suppkey, l_extendedprice, l_discount], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 + ┌───── Stage 1 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 │ FilterExec: r_name@1 = AMERICA, projection=[r_regionkey@0] - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/region/1.parquet], [/testdata/tpch/plan_sf0.02/region/10.parquet], [/testdata/tpch/plan_sf0.02/region/11.parquet], [/testdata/tpch/plan_sf0.02/region/12.parquet], [/testdata/tpch/plan_sf0.02/region/13.parquet], ...]}, projection=[r_regionkey, r_name], file_type=parquet, predicate=r_name@1 = AMERICA, pruning_predicate=r_name_null_count@2 != row_count@3 AND r_name_min@0 <= AMERICA AND AMERICA <= r_name_max@1, required_guarantees=[r_name in (AMERICA)] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/region/1.parquet, /testdata/tpch/plan_sf0.02/region/10.parquet, /testdata/tpch/plan_sf0.02/region/11.parquet], [/testdata/tpch/plan_sf0.02/region/12.parquet, /testdata/tpch/plan_sf0.02/region/13.parquet, /testdata/tpch/plan_sf0.02/region/14.parquet], [/testdata/tpch/plan_sf0.02/region/15.parquet, /testdata/tpch/plan_sf0.02/region/16.parquet, /testdata/tpch/plan_sf0.02/region/2.parquet], [/testdata/tpch/plan_sf0.02/region/3.parquet, /testdata/tpch/plan_sf0.02/region/4.parquet, /testdata/tpch/plan_sf0.02/region/5.parquet], [/testdata/tpch/plan_sf0.02/region/6.parquet, /testdata/tpch/plan_sf0.02/region/7.parquet, /testdata/tpch/plan_sf0.02/region/8.parquet], ...]}, projection=[r_regionkey, r_name], file_type=parquet, predicate=r_name@1 = AMERICA, pruning_predicate=r_name_null_count@2 != row_count@3 AND r_name_min@0 <= AMERICA AND AMERICA <= r_name_max@1, required_guarantees=[r_name in (AMERICA)] └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/nation/1.parquet], [/testdata/tpch/plan_sf0.02/nation/10.parquet], [/testdata/tpch/plan_sf0.02/nation/11.parquet], [/testdata/tpch/plan_sf0.02/nation/12.parquet], [/testdata/tpch/plan_sf0.02/nation/13.parquet], ...]}, projection=[n_nationkey, n_name], file_type=parquet + ┌───── Stage 2 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/nation/1.parquet, /testdata/tpch/plan_sf0.02/nation/10.parquet, /testdata/tpch/plan_sf0.02/nation/11.parquet], [/testdata/tpch/plan_sf0.02/nation/12.parquet, /testdata/tpch/plan_sf0.02/nation/13.parquet, /testdata/tpch/plan_sf0.02/nation/14.parquet], [/testdata/tpch/plan_sf0.02/nation/15.parquet, /testdata/tpch/plan_sf0.02/nation/16.parquet, /testdata/tpch/plan_sf0.02/nation/2.parquet], [/testdata/tpch/plan_sf0.02/nation/3.parquet, /testdata/tpch/plan_sf0.02/nation/4.parquet, /testdata/tpch/plan_sf0.02/nation/5.parquet], [/testdata/tpch/plan_sf0.02/nation/6.parquet, /testdata/tpch/plan_sf0.02/nation/7.parquet, /testdata/tpch/plan_sf0.02/nation/8.parquet], ...]}, projection=[n_nationkey, n_name], file_type=parquet └────────────────────────────────────────────────── - ┌───── Stage 3 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/nation/1.parquet], [/testdata/tpch/plan_sf0.02/nation/10.parquet], [/testdata/tpch/plan_sf0.02/nation/11.parquet], [/testdata/tpch/plan_sf0.02/nation/12.parquet], [/testdata/tpch/plan_sf0.02/nation/13.parquet], ...]}, projection=[n_nationkey, n_regionkey], file_type=parquet, predicate=DynamicFilter [ empty ] + ┌───── Stage 3 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/nation/1.parquet, /testdata/tpch/plan_sf0.02/nation/10.parquet, /testdata/tpch/plan_sf0.02/nation/11.parquet], [/testdata/tpch/plan_sf0.02/nation/12.parquet, /testdata/tpch/plan_sf0.02/nation/13.parquet, /testdata/tpch/plan_sf0.02/nation/14.parquet], [/testdata/tpch/plan_sf0.02/nation/15.parquet, /testdata/tpch/plan_sf0.02/nation/16.parquet, /testdata/tpch/plan_sf0.02/nation/2.parquet], [/testdata/tpch/plan_sf0.02/nation/3.parquet, /testdata/tpch/plan_sf0.02/nation/4.parquet, /testdata/tpch/plan_sf0.02/nation/5.parquet], [/testdata/tpch/plan_sf0.02/nation/6.parquet, /testdata/tpch/plan_sf0.02/nation/7.parquet, /testdata/tpch/plan_sf0.02/nation/8.parquet], ...]}, projection=[n_nationkey, n_regionkey], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 4 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/customer/1.parquet], [/testdata/tpch/plan_sf0.02/customer/10.parquet], [/testdata/tpch/plan_sf0.02/customer/11.parquet], [/testdata/tpch/plan_sf0.02/customer/12.parquet], [/testdata/tpch/plan_sf0.02/customer/13.parquet], ...]}, projection=[c_custkey, c_nationkey], file_type=parquet, predicate=DynamicFilter [ empty ] + ┌───── Stage 4 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/customer/1.parquet, /testdata/tpch/plan_sf0.02/customer/10.parquet, /testdata/tpch/plan_sf0.02/customer/11.parquet], [/testdata/tpch/plan_sf0.02/customer/12.parquet, /testdata/tpch/plan_sf0.02/customer/13.parquet, /testdata/tpch/plan_sf0.02/customer/14.parquet], [/testdata/tpch/plan_sf0.02/customer/15.parquet, /testdata/tpch/plan_sf0.02/customer/16.parquet, /testdata/tpch/plan_sf0.02/customer/2.parquet], [/testdata/tpch/plan_sf0.02/customer/3.parquet, /testdata/tpch/plan_sf0.02/customer/4.parquet, /testdata/tpch/plan_sf0.02/customer/5.parquet], [/testdata/tpch/plan_sf0.02/customer/6.parquet, /testdata/tpch/plan_sf0.02/customer/7.parquet, /testdata/tpch/plan_sf0.02/customer/8.parquet], ...]}, projection=[c_custkey, c_nationkey], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 5 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 + ┌───── Stage 5 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 │ FilterExec: o_orderdate@2 >= 1995-01-01 AND o_orderdate@2 <= 1996-12-31 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/orders/1.parquet], [/testdata/tpch/plan_sf0.02/orders/10.parquet], [/testdata/tpch/plan_sf0.02/orders/11.parquet], [/testdata/tpch/plan_sf0.02/orders/12.parquet], [/testdata/tpch/plan_sf0.02/orders/13.parquet], ...]}, projection=[o_orderkey, o_custkey, o_orderdate], file_type=parquet, predicate=o_orderdate@4 >= 1995-01-01 AND o_orderdate@4 <= 1996-12-31 AND DynamicFilter [ empty ], pruning_predicate=o_orderdate_null_count@1 != row_count@2 AND o_orderdate_max@0 >= 1995-01-01 AND o_orderdate_null_count@1 != row_count@2 AND o_orderdate_min@3 <= 1996-12-31, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/orders/1.parquet, /testdata/tpch/plan_sf0.02/orders/10.parquet, /testdata/tpch/plan_sf0.02/orders/11.parquet], [/testdata/tpch/plan_sf0.02/orders/12.parquet, /testdata/tpch/plan_sf0.02/orders/13.parquet, /testdata/tpch/plan_sf0.02/orders/14.parquet], [/testdata/tpch/plan_sf0.02/orders/15.parquet, /testdata/tpch/plan_sf0.02/orders/16.parquet, /testdata/tpch/plan_sf0.02/orders/2.parquet], [/testdata/tpch/plan_sf0.02/orders/3.parquet, /testdata/tpch/plan_sf0.02/orders/4.parquet, /testdata/tpch/plan_sf0.02/orders/5.parquet], [/testdata/tpch/plan_sf0.02/orders/6.parquet, /testdata/tpch/plan_sf0.02/orders/7.parquet, /testdata/tpch/plan_sf0.02/orders/8.parquet], ...]}, projection=[o_orderkey, o_custkey, o_orderdate], file_type=parquet, predicate=o_orderdate@4 >= 1995-01-01 AND o_orderdate@4 <= 1996-12-31 AND DynamicFilter [ empty ], pruning_predicate=o_orderdate_null_count@1 != row_count@2 AND o_orderdate_max@0 >= 1995-01-01 AND o_orderdate_null_count@1 != row_count@2 AND o_orderdate_min@3 <= 1996-12-31, required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 6 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/supplier/1.parquet], [/testdata/tpch/plan_sf0.02/supplier/10.parquet], [/testdata/tpch/plan_sf0.02/supplier/11.parquet], [/testdata/tpch/plan_sf0.02/supplier/12.parquet], [/testdata/tpch/plan_sf0.02/supplier/13.parquet], ...]}, projection=[s_suppkey, s_nationkey], file_type=parquet, predicate=DynamicFilter [ empty ] + ┌───── Stage 6 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/supplier/1.parquet, /testdata/tpch/plan_sf0.02/supplier/10.parquet, /testdata/tpch/plan_sf0.02/supplier/11.parquet], [/testdata/tpch/plan_sf0.02/supplier/12.parquet, /testdata/tpch/plan_sf0.02/supplier/13.parquet, /testdata/tpch/plan_sf0.02/supplier/14.parquet], [/testdata/tpch/plan_sf0.02/supplier/15.parquet, /testdata/tpch/plan_sf0.02/supplier/16.parquet, /testdata/tpch/plan_sf0.02/supplier/2.parquet], [/testdata/tpch/plan_sf0.02/supplier/3.parquet, /testdata/tpch/plan_sf0.02/supplier/4.parquet, /testdata/tpch/plan_sf0.02/supplier/5.parquet], [/testdata/tpch/plan_sf0.02/supplier/6.parquet, /testdata/tpch/plan_sf0.02/supplier/7.parquet, /testdata/tpch/plan_sf0.02/supplier/8.parquet], ...]}, projection=[s_suppkey, s_nationkey], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 7 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 + ┌───── Stage 7 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 │ FilterExec: p_type@1 = ECONOMY ANODIZED STEEL, projection=[p_partkey@0] - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/part/1.parquet], [/testdata/tpch/plan_sf0.02/part/10.parquet], [/testdata/tpch/plan_sf0.02/part/11.parquet], [/testdata/tpch/plan_sf0.02/part/12.parquet], [/testdata/tpch/plan_sf0.02/part/13.parquet], ...]}, projection=[p_partkey, p_type], file_type=parquet, predicate=p_type@4 = ECONOMY ANODIZED STEEL, pruning_predicate=p_type_null_count@2 != row_count@3 AND p_type_min@0 <= ECONOMY ANODIZED STEEL AND ECONOMY ANODIZED STEEL <= p_type_max@1, required_guarantees=[p_type in (ECONOMY ANODIZED STEEL)] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/part/1.parquet, /testdata/tpch/plan_sf0.02/part/10.parquet, /testdata/tpch/plan_sf0.02/part/11.parquet], [/testdata/tpch/plan_sf0.02/part/12.parquet, /testdata/tpch/plan_sf0.02/part/13.parquet, /testdata/tpch/plan_sf0.02/part/14.parquet], [/testdata/tpch/plan_sf0.02/part/15.parquet, /testdata/tpch/plan_sf0.02/part/16.parquet, /testdata/tpch/plan_sf0.02/part/2.parquet], [/testdata/tpch/plan_sf0.02/part/3.parquet, /testdata/tpch/plan_sf0.02/part/4.parquet, /testdata/tpch/plan_sf0.02/part/5.parquet], [/testdata/tpch/plan_sf0.02/part/6.parquet, /testdata/tpch/plan_sf0.02/part/7.parquet, /testdata/tpch/plan_sf0.02/part/8.parquet], ...]}, projection=[p_partkey, p_type], file_type=parquet, predicate=p_type@4 = ECONOMY ANODIZED STEEL, pruning_predicate=p_type_null_count@2 != row_count@3 AND p_type_min@0 <= ECONOMY ANODIZED STEEL AND ECONOMY ANODIZED STEEL <= p_type_max@1, required_guarantees=[p_type in (ECONOMY ANODIZED STEEL)] └────────────────────────────────────────────────── "#); Ok(()) @@ -459,7 +424,7 @@ mod tests { #[tokio::test] async fn test_tpch_9() -> Result<(), Box> { let plan = test_tpch_query("q9").await?; - assert_snapshot!(plan, @" + assert_snapshot!(plan, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [nation@0 ASC NULLS LAST, o_year@1 DESC] │ [Stage 7] => NetworkCoalesceExec: output_partitions=18, input_tasks=3 @@ -471,55 +436,49 @@ mod tests { │ [Stage 6] => NetworkShuffleExec: output_partitions=6, input_tasks=4 └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p17] t1:[p0..p17] t2:[p0..p17] t3:[p0..p17] - │ RepartitionExec: partitioning=Hash([nation@0, o_year@1], 18), input_partitions=4 + │ RepartitionExec: partitioning=Hash([nation@0, o_year@1], 18), input_partitions=6 │ AggregateExec: mode=Partial, gby=[nation@0 as nation, o_year@1 as o_year], aggr=[sum(profit.amount)] │ ProjectionExec: expr=[n_name@0 as nation, date_part(YEAR, o_orderdate@5) as o_year, l_extendedprice@2 * (Some(1),20,0 - l_discount@3) - ps_supplycost@4 * l_quantity@1 as amount] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(n_nationkey@0, s_nationkey@3)], projection=[n_name@1, l_quantity@2, l_extendedprice@3, l_discount@4, ps_supplycost@6, o_orderdate@7] │ CoalescePartitionsExec - │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ ProjectionExec: expr=[l_quantity@1 as l_quantity, l_extendedprice@2 as l_extendedprice, l_discount@3 as l_discount, s_nationkey@4 as s_nationkey, ps_supplycost@5 as ps_supplycost, o_orderdate@0 as o_orderdate] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(o_orderkey@0, l_orderkey@0)], projection=[o_orderdate@1, l_quantity@3, l_extendedprice@4, l_discount@5, s_nationkey@6, ps_supplycost@7] │ CoalescePartitionsExec - │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ ProjectionExec: expr=[l_orderkey@1 as l_orderkey, l_quantity@2 as l_quantity, l_extendedprice@3 as l_extendedprice, l_discount@4 as l_discount, s_nationkey@5 as s_nationkey, ps_supplycost@0 as ps_supplycost] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ps_suppkey@1, l_suppkey@2), (ps_partkey@0, l_partkey@1)], projection=[ps_supplycost@2, l_orderkey@3, l_quantity@6, l_extendedprice@7, l_discount@8, s_nationkey@9] │ CoalescePartitionsExec - │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ ProjectionExec: expr=[l_orderkey@1 as l_orderkey, l_partkey@2 as l_partkey, l_suppkey@3 as l_suppkey, l_quantity@4 as l_quantity, l_extendedprice@5 as l_extendedprice, l_discount@6 as l_discount, s_nationkey@0 as s_nationkey] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(s_suppkey@0, l_suppkey@2)], projection=[s_nationkey@1, l_orderkey@2, l_partkey@3, l_suppkey@4, l_quantity@5, l_extendedprice@6, l_discount@7] │ CoalescePartitionsExec - │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(p_partkey@0, l_partkey@1)], projection=[l_orderkey@1, l_partkey@2, l_suppkey@3, l_quantity@4, l_extendedprice@5, l_discount@6] │ CoalescePartitionsExec - │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet], [/testdata/tpch/plan_sf0.02/lineitem/10.parquet], [/testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet], [/testdata/tpch/plan_sf0.02/lineitem/13.parquet], ...]}, projection=[l_orderkey, l_partkey, l_suppkey, l_quantity, l_extendedprice, l_discount], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet, /testdata/tpch/plan_sf0.02/lineitem/10.parquet, /testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet, /testdata/tpch/plan_sf0.02/lineitem/13.parquet, /testdata/tpch/plan_sf0.02/lineitem/14.parquet], [/testdata/tpch/plan_sf0.02/lineitem/15.parquet, /testdata/tpch/plan_sf0.02/lineitem/16.parquet, /testdata/tpch/plan_sf0.02/lineitem/2.parquet], [/testdata/tpch/plan_sf0.02/lineitem/3.parquet, /testdata/tpch/plan_sf0.02/lineitem/4.parquet, /testdata/tpch/plan_sf0.02/lineitem/5.parquet], [/testdata/tpch/plan_sf0.02/lineitem/6.parquet, /testdata/tpch/plan_sf0.02/lineitem/7.parquet, /testdata/tpch/plan_sf0.02/lineitem/8.parquet], ...]}, projection=[l_orderkey, l_partkey, l_suppkey, l_quantity, l_extendedprice, l_discount], file_type=parquet, predicate=DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/nation/1.parquet], [/testdata/tpch/plan_sf0.02/nation/10.parquet], [/testdata/tpch/plan_sf0.02/nation/11.parquet], [/testdata/tpch/plan_sf0.02/nation/12.parquet], [/testdata/tpch/plan_sf0.02/nation/13.parquet], ...]}, projection=[n_nationkey, n_name], file_type=parquet + ┌───── Stage 1 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/nation/1.parquet, /testdata/tpch/plan_sf0.02/nation/10.parquet, /testdata/tpch/plan_sf0.02/nation/11.parquet], [/testdata/tpch/plan_sf0.02/nation/12.parquet, /testdata/tpch/plan_sf0.02/nation/13.parquet, /testdata/tpch/plan_sf0.02/nation/14.parquet], [/testdata/tpch/plan_sf0.02/nation/15.parquet, /testdata/tpch/plan_sf0.02/nation/16.parquet, /testdata/tpch/plan_sf0.02/nation/2.parquet], [/testdata/tpch/plan_sf0.02/nation/3.parquet, /testdata/tpch/plan_sf0.02/nation/4.parquet, /testdata/tpch/plan_sf0.02/nation/5.parquet], [/testdata/tpch/plan_sf0.02/nation/6.parquet, /testdata/tpch/plan_sf0.02/nation/7.parquet, /testdata/tpch/plan_sf0.02/nation/8.parquet], ...]}, projection=[n_nationkey, n_name], file_type=parquet └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/orders/1.parquet], [/testdata/tpch/plan_sf0.02/orders/10.parquet], [/testdata/tpch/plan_sf0.02/orders/11.parquet], [/testdata/tpch/plan_sf0.02/orders/12.parquet], [/testdata/tpch/plan_sf0.02/orders/13.parquet], ...]}, projection=[o_orderkey, o_orderdate], file_type=parquet + ┌───── Stage 2 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/orders/1.parquet, /testdata/tpch/plan_sf0.02/orders/10.parquet, /testdata/tpch/plan_sf0.02/orders/11.parquet], [/testdata/tpch/plan_sf0.02/orders/12.parquet, /testdata/tpch/plan_sf0.02/orders/13.parquet, /testdata/tpch/plan_sf0.02/orders/14.parquet], [/testdata/tpch/plan_sf0.02/orders/15.parquet, /testdata/tpch/plan_sf0.02/orders/16.parquet, /testdata/tpch/plan_sf0.02/orders/2.parquet], [/testdata/tpch/plan_sf0.02/orders/3.parquet, /testdata/tpch/plan_sf0.02/orders/4.parquet, /testdata/tpch/plan_sf0.02/orders/5.parquet], [/testdata/tpch/plan_sf0.02/orders/6.parquet, /testdata/tpch/plan_sf0.02/orders/7.parquet, /testdata/tpch/plan_sf0.02/orders/8.parquet], ...]}, projection=[o_orderkey, o_orderdate], file_type=parquet └────────────────────────────────────────────────── - ┌───── Stage 3 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/partsupp/1.parquet], [/testdata/tpch/plan_sf0.02/partsupp/10.parquet], [/testdata/tpch/plan_sf0.02/partsupp/11.parquet], [/testdata/tpch/plan_sf0.02/partsupp/12.parquet], [/testdata/tpch/plan_sf0.02/partsupp/13.parquet], ...]}, projection=[ps_partkey, ps_suppkey, ps_supplycost], file_type=parquet + ┌───── Stage 3 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/partsupp/1.parquet, /testdata/tpch/plan_sf0.02/partsupp/10.parquet, /testdata/tpch/plan_sf0.02/partsupp/11.parquet], [/testdata/tpch/plan_sf0.02/partsupp/12.parquet, /testdata/tpch/plan_sf0.02/partsupp/13.parquet, /testdata/tpch/plan_sf0.02/partsupp/14.parquet], [/testdata/tpch/plan_sf0.02/partsupp/15.parquet, /testdata/tpch/plan_sf0.02/partsupp/16.parquet, /testdata/tpch/plan_sf0.02/partsupp/2.parquet], [/testdata/tpch/plan_sf0.02/partsupp/3.parquet, /testdata/tpch/plan_sf0.02/partsupp/4.parquet, /testdata/tpch/plan_sf0.02/partsupp/5.parquet], [/testdata/tpch/plan_sf0.02/partsupp/6.parquet, /testdata/tpch/plan_sf0.02/partsupp/7.parquet, /testdata/tpch/plan_sf0.02/partsupp/8.parquet], ...]}, projection=[ps_partkey, ps_suppkey, ps_supplycost], file_type=parquet └────────────────────────────────────────────────── - ┌───── Stage 4 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/supplier/1.parquet], [/testdata/tpch/plan_sf0.02/supplier/10.parquet], [/testdata/tpch/plan_sf0.02/supplier/11.parquet], [/testdata/tpch/plan_sf0.02/supplier/12.parquet], [/testdata/tpch/plan_sf0.02/supplier/13.parquet], ...]}, projection=[s_suppkey, s_nationkey], file_type=parquet, predicate=DynamicFilter [ empty ] + ┌───── Stage 4 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/supplier/1.parquet, /testdata/tpch/plan_sf0.02/supplier/10.parquet, /testdata/tpch/plan_sf0.02/supplier/11.parquet], [/testdata/tpch/plan_sf0.02/supplier/12.parquet, /testdata/tpch/plan_sf0.02/supplier/13.parquet, /testdata/tpch/plan_sf0.02/supplier/14.parquet], [/testdata/tpch/plan_sf0.02/supplier/15.parquet, /testdata/tpch/plan_sf0.02/supplier/16.parquet, /testdata/tpch/plan_sf0.02/supplier/2.parquet], [/testdata/tpch/plan_sf0.02/supplier/3.parquet, /testdata/tpch/plan_sf0.02/supplier/4.parquet, /testdata/tpch/plan_sf0.02/supplier/5.parquet], [/testdata/tpch/plan_sf0.02/supplier/6.parquet, /testdata/tpch/plan_sf0.02/supplier/7.parquet, /testdata/tpch/plan_sf0.02/supplier/8.parquet], ...]}, projection=[s_suppkey, s_nationkey], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 5 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 + ┌───── Stage 5 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 │ FilterExec: p_name@1 LIKE %green%, projection=[p_partkey@0] - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/part/1.parquet], [/testdata/tpch/plan_sf0.02/part/10.parquet], [/testdata/tpch/plan_sf0.02/part/11.parquet], [/testdata/tpch/plan_sf0.02/part/12.parquet], [/testdata/tpch/plan_sf0.02/part/13.parquet], ...]}, projection=[p_partkey, p_name], file_type=parquet, predicate=p_name@1 LIKE %green% + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/part/1.parquet, /testdata/tpch/plan_sf0.02/part/10.parquet, /testdata/tpch/plan_sf0.02/part/11.parquet], [/testdata/tpch/plan_sf0.02/part/12.parquet, /testdata/tpch/plan_sf0.02/part/13.parquet, /testdata/tpch/plan_sf0.02/part/14.parquet], [/testdata/tpch/plan_sf0.02/part/15.parquet, /testdata/tpch/plan_sf0.02/part/16.parquet, /testdata/tpch/plan_sf0.02/part/2.parquet], [/testdata/tpch/plan_sf0.02/part/3.parquet, /testdata/tpch/plan_sf0.02/part/4.parquet, /testdata/tpch/plan_sf0.02/part/5.parquet], [/testdata/tpch/plan_sf0.02/part/6.parquet, /testdata/tpch/plan_sf0.02/part/7.parquet, /testdata/tpch/plan_sf0.02/part/8.parquet], ...]}, projection=[p_partkey, p_name], file_type=parquet, predicate=p_name@1 LIKE %green% └────────────────────────────────────────────────── "); Ok(()) @@ -528,7 +487,7 @@ mod tests { #[tokio::test] async fn test_tpch_10() -> Result<(), Box> { let plan = test_tpch_query("q10").await?; - assert_snapshot!(plan, @" + assert_snapshot!(plan, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [revenue@2 DESC] │ [Stage 5] => NetworkCoalesceExec: output_partitions=18, input_tasks=3 @@ -540,37 +499,33 @@ mod tests { │ [Stage 4] => NetworkShuffleExec: output_partitions=6, input_tasks=4 └────────────────────────────────────────────────── ┌───── Stage 4 ── Tasks: t0:[p0..p17] t1:[p0..p17] t2:[p0..p17] t3:[p0..p17] - │ RepartitionExec: partitioning=Hash([c_custkey@0, c_name@1, c_acctbal@2, c_phone@3, n_name@4, c_address@5, c_comment@6], 18), input_partitions=4 + │ RepartitionExec: partitioning=Hash([c_custkey@0, c_name@1, c_acctbal@2, c_phone@3, n_name@4, c_address@5, c_comment@6], 18), input_partitions=6 │ AggregateExec: mode=Partial, gby=[c_custkey@0 as c_custkey, c_name@1 as c_name, c_acctbal@4 as c_acctbal, c_phone@3 as c_phone, n_name@8 as n_name, c_address@2 as c_address, c_comment@5 as c_comment], aggr=[sum(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount)] │ ProjectionExec: expr=[c_custkey@1 as c_custkey, c_name@2 as c_name, c_address@3 as c_address, c_phone@4 as c_phone, c_acctbal@5 as c_acctbal, c_comment@6 as c_comment, l_extendedprice@7 as l_extendedprice, l_discount@8 as l_discount, n_name@0 as n_name] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(n_nationkey@0, c_nationkey@3)], projection=[n_name@1, c_custkey@2, c_name@3, c_address@4, c_phone@6, c_acctbal@7, c_comment@8, l_extendedprice@9, l_discount@10] │ CoalescePartitionsExec - │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(o_orderkey@7, l_orderkey@0)], projection=[c_custkey@0, c_name@1, c_address@2, c_nationkey@3, c_phone@4, c_acctbal@5, c_comment@6, l_extendedprice@9, l_discount@10] │ CoalescePartitionsExec - │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ FilterExec: l_returnflag@3 = R, projection=[l_orderkey@0, l_extendedprice@1, l_discount@2] - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet], [/testdata/tpch/plan_sf0.02/lineitem/10.parquet], [/testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet], [/testdata/tpch/plan_sf0.02/lineitem/13.parquet], ...]}, projection=[l_orderkey, l_extendedprice, l_discount, l_returnflag], file_type=parquet, predicate=l_returnflag@8 = R AND DynamicFilter [ empty ], pruning_predicate=l_returnflag_null_count@2 != row_count@3 AND l_returnflag_min@0 <= R AND R <= l_returnflag_max@1, required_guarantees=[l_returnflag in (R)] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet, /testdata/tpch/plan_sf0.02/lineitem/10.parquet, /testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet, /testdata/tpch/plan_sf0.02/lineitem/13.parquet, /testdata/tpch/plan_sf0.02/lineitem/14.parquet], [/testdata/tpch/plan_sf0.02/lineitem/15.parquet, /testdata/tpch/plan_sf0.02/lineitem/16.parquet, /testdata/tpch/plan_sf0.02/lineitem/2.parquet], [/testdata/tpch/plan_sf0.02/lineitem/3.parquet, /testdata/tpch/plan_sf0.02/lineitem/4.parquet, /testdata/tpch/plan_sf0.02/lineitem/5.parquet], [/testdata/tpch/plan_sf0.02/lineitem/6.parquet, /testdata/tpch/plan_sf0.02/lineitem/7.parquet, /testdata/tpch/plan_sf0.02/lineitem/8.parquet], ...]}, projection=[l_orderkey, l_extendedprice, l_discount, l_returnflag], file_type=parquet, predicate=l_returnflag@8 = R AND DynamicFilter [ empty ], pruning_predicate=l_returnflag_null_count@2 != row_count@3 AND l_returnflag_min@0 <= R AND R <= l_returnflag_max@1, required_guarantees=[l_returnflag in (R)] └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/nation/1.parquet], [/testdata/tpch/plan_sf0.02/nation/10.parquet], [/testdata/tpch/plan_sf0.02/nation/11.parquet], [/testdata/tpch/plan_sf0.02/nation/12.parquet], [/testdata/tpch/plan_sf0.02/nation/13.parquet], ...]}, projection=[n_nationkey, n_name], file_type=parquet + ┌───── Stage 1 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/nation/1.parquet, /testdata/tpch/plan_sf0.02/nation/10.parquet, /testdata/tpch/plan_sf0.02/nation/11.parquet], [/testdata/tpch/plan_sf0.02/nation/12.parquet, /testdata/tpch/plan_sf0.02/nation/13.parquet, /testdata/tpch/plan_sf0.02/nation/14.parquet], [/testdata/tpch/plan_sf0.02/nation/15.parquet, /testdata/tpch/plan_sf0.02/nation/16.parquet, /testdata/tpch/plan_sf0.02/nation/2.parquet], [/testdata/tpch/plan_sf0.02/nation/3.parquet, /testdata/tpch/plan_sf0.02/nation/4.parquet, /testdata/tpch/plan_sf0.02/nation/5.parquet], [/testdata/tpch/plan_sf0.02/nation/6.parquet, /testdata/tpch/plan_sf0.02/nation/7.parquet, /testdata/tpch/plan_sf0.02/nation/8.parquet], ...]}, projection=[n_nationkey, n_name], file_type=parquet └────────────────────────────────────────────────── - ┌───── Stage 3 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 + ┌───── Stage 3 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(c_custkey@0, o_custkey@1)], projection=[c_custkey@0, c_name@1, c_address@2, c_nationkey@3, c_phone@4, c_acctbal@5, c_comment@6, o_orderkey@7] │ CoalescePartitionsExec - │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ FilterExec: o_orderdate@2 >= 1993-10-01 AND o_orderdate@2 < 1994-01-01, projection=[o_orderkey@0, o_custkey@1] - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/orders/1.parquet], [/testdata/tpch/plan_sf0.02/orders/10.parquet], [/testdata/tpch/plan_sf0.02/orders/11.parquet], [/testdata/tpch/plan_sf0.02/orders/12.parquet], [/testdata/tpch/plan_sf0.02/orders/13.parquet], ...]}, projection=[o_orderkey, o_custkey, o_orderdate], file_type=parquet, predicate=o_orderdate@4 >= 1993-10-01 AND o_orderdate@4 < 1994-01-01 AND DynamicFilter [ empty ], pruning_predicate=o_orderdate_null_count@1 != row_count@2 AND o_orderdate_max@0 >= 1993-10-01 AND o_orderdate_null_count@1 != row_count@2 AND o_orderdate_min@3 < 1994-01-01, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/orders/1.parquet, /testdata/tpch/plan_sf0.02/orders/10.parquet, /testdata/tpch/plan_sf0.02/orders/11.parquet], [/testdata/tpch/plan_sf0.02/orders/12.parquet, /testdata/tpch/plan_sf0.02/orders/13.parquet, /testdata/tpch/plan_sf0.02/orders/14.parquet], [/testdata/tpch/plan_sf0.02/orders/15.parquet, /testdata/tpch/plan_sf0.02/orders/16.parquet, /testdata/tpch/plan_sf0.02/orders/2.parquet], [/testdata/tpch/plan_sf0.02/orders/3.parquet, /testdata/tpch/plan_sf0.02/orders/4.parquet, /testdata/tpch/plan_sf0.02/orders/5.parquet], [/testdata/tpch/plan_sf0.02/orders/6.parquet, /testdata/tpch/plan_sf0.02/orders/7.parquet, /testdata/tpch/plan_sf0.02/orders/8.parquet], ...]}, projection=[o_orderkey, o_custkey, o_orderdate], file_type=parquet, predicate=o_orderdate@4 >= 1993-10-01 AND o_orderdate@4 < 1994-01-01 AND DynamicFilter [ empty ], pruning_predicate=o_orderdate_null_count@1 != row_count@2 AND o_orderdate_max@0 >= 1993-10-01 AND o_orderdate_null_count@1 != row_count@2 AND o_orderdate_min@3 < 1994-01-01, required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/customer/1.parquet], [/testdata/tpch/plan_sf0.02/customer/10.parquet], [/testdata/tpch/plan_sf0.02/customer/11.parquet], [/testdata/tpch/plan_sf0.02/customer/12.parquet], [/testdata/tpch/plan_sf0.02/customer/13.parquet], ...]}, projection=[c_custkey, c_name, c_address, c_nationkey, c_phone, c_acctbal, c_comment], file_type=parquet, predicate=DynamicFilter [ empty ] + ┌───── Stage 2 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/customer/1.parquet, /testdata/tpch/plan_sf0.02/customer/10.parquet, /testdata/tpch/plan_sf0.02/customer/11.parquet], [/testdata/tpch/plan_sf0.02/customer/12.parquet, /testdata/tpch/plan_sf0.02/customer/13.parquet, /testdata/tpch/plan_sf0.02/customer/14.parquet], [/testdata/tpch/plan_sf0.02/customer/15.parquet, /testdata/tpch/plan_sf0.02/customer/16.parquet, /testdata/tpch/plan_sf0.02/customer/2.parquet], [/testdata/tpch/plan_sf0.02/customer/3.parquet, /testdata/tpch/plan_sf0.02/customer/4.parquet, /testdata/tpch/plan_sf0.02/customer/5.parquet], [/testdata/tpch/plan_sf0.02/customer/6.parquet, /testdata/tpch/plan_sf0.02/customer/7.parquet, /testdata/tpch/plan_sf0.02/customer/8.parquet], ...]}, projection=[c_custkey, c_name, c_address, c_nationkey, c_phone, c_acctbal, c_comment], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── "); Ok(()) @@ -579,7 +534,7 @@ mod tests { #[tokio::test] async fn test_tpch_11() -> Result<(), Box> { let plan = test_tpch_query("q11").await?; - assert_snapshot!(plan, @" + assert_snapshot!(plan, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [value@1 DESC] │ SortExec: expr=[value@1 DESC], preserve_partitioning=[true] @@ -588,57 +543,51 @@ mod tests { │ ProjectionExec: expr=[CAST(CAST(sum(partsupp.ps_supplycost * partsupp.ps_availqty)@0 AS Float64) * 0.0001 AS Decimal128(38, 15)) as sum(partsupp.ps_supplycost * partsupp.ps_availqty) * Float64(0.0001)] │ AggregateExec: mode=Final, gby=[], aggr=[sum(partsupp.ps_supplycost * partsupp.ps_availqty)] │ CoalescePartitionsExec - │ [Stage 3] => NetworkCoalesceExec: output_partitions=16, input_tasks=4 + │ [Stage 3] => NetworkCoalesceExec: output_partitions=24, input_tasks=4 │ ProjectionExec: expr=[ps_partkey@0 as ps_partkey, sum(partsupp.ps_supplycost * partsupp.ps_availqty)@1 as sum(partsupp.ps_supplycost * partsupp.ps_availqty), CAST(sum(partsupp.ps_supplycost * partsupp.ps_availqty)@1 AS Decimal128(38, 15)) as join_proj_push_down_1] │ AggregateExec: mode=FinalPartitioned, gby=[ps_partkey@0 as ps_partkey], aggr=[sum(partsupp.ps_supplycost * partsupp.ps_availqty)] │ [Stage 6] => NetworkShuffleExec: output_partitions=6, input_tasks=4 └────────────────────────────────────────────────── - ┌───── Stage 3 ── Tasks: t0:[p0..p3] t1:[p4..p7] t2:[p8..p11] t3:[p12..p15] + ┌───── Stage 3 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] t3:[p18..p23] │ AggregateExec: mode=Partial, gby=[], aggr=[sum(partsupp.ps_supplycost * partsupp.ps_availqty)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(n_nationkey@0, s_nationkey@2)], projection=[ps_availqty@1, ps_supplycost@2] │ CoalescePartitionsExec - │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ ProjectionExec: expr=[ps_availqty@1 as ps_availqty, ps_supplycost@2 as ps_supplycost, s_nationkey@0 as s_nationkey] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(s_suppkey@0, ps_suppkey@0)], projection=[s_nationkey@1, ps_availqty@3, ps_supplycost@4] │ CoalescePartitionsExec - │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/partsupp/1.parquet], [/testdata/tpch/plan_sf0.02/partsupp/10.parquet], [/testdata/tpch/plan_sf0.02/partsupp/11.parquet], [/testdata/tpch/plan_sf0.02/partsupp/12.parquet], [/testdata/tpch/plan_sf0.02/partsupp/13.parquet], ...]}, projection=[ps_suppkey, ps_availqty, ps_supplycost], file_type=parquet, predicate=DynamicFilter [ empty ] + │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/partsupp/1.parquet, /testdata/tpch/plan_sf0.02/partsupp/10.parquet, /testdata/tpch/plan_sf0.02/partsupp/11.parquet], [/testdata/tpch/plan_sf0.02/partsupp/12.parquet, /testdata/tpch/plan_sf0.02/partsupp/13.parquet, /testdata/tpch/plan_sf0.02/partsupp/14.parquet], [/testdata/tpch/plan_sf0.02/partsupp/15.parquet, /testdata/tpch/plan_sf0.02/partsupp/16.parquet, /testdata/tpch/plan_sf0.02/partsupp/2.parquet], [/testdata/tpch/plan_sf0.02/partsupp/3.parquet, /testdata/tpch/plan_sf0.02/partsupp/4.parquet, /testdata/tpch/plan_sf0.02/partsupp/5.parquet], [/testdata/tpch/plan_sf0.02/partsupp/6.parquet, /testdata/tpch/plan_sf0.02/partsupp/7.parquet, /testdata/tpch/plan_sf0.02/partsupp/8.parquet], ...]}, projection=[ps_suppkey, ps_availqty, ps_supplycost], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 + ┌───── Stage 1 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 │ FilterExec: n_name@1 = GERMANY, projection=[n_nationkey@0] - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/nation/1.parquet], [/testdata/tpch/plan_sf0.02/nation/10.parquet], [/testdata/tpch/plan_sf0.02/nation/11.parquet], [/testdata/tpch/plan_sf0.02/nation/12.parquet], [/testdata/tpch/plan_sf0.02/nation/13.parquet], ...]}, projection=[n_nationkey, n_name], file_type=parquet, predicate=n_name@1 = GERMANY, pruning_predicate=n_name_null_count@2 != row_count@3 AND n_name_min@0 <= GERMANY AND GERMANY <= n_name_max@1, required_guarantees=[n_name in (GERMANY)] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/nation/1.parquet, /testdata/tpch/plan_sf0.02/nation/10.parquet, /testdata/tpch/plan_sf0.02/nation/11.parquet], [/testdata/tpch/plan_sf0.02/nation/12.parquet, /testdata/tpch/plan_sf0.02/nation/13.parquet, /testdata/tpch/plan_sf0.02/nation/14.parquet], [/testdata/tpch/plan_sf0.02/nation/15.parquet, /testdata/tpch/plan_sf0.02/nation/16.parquet, /testdata/tpch/plan_sf0.02/nation/2.parquet], [/testdata/tpch/plan_sf0.02/nation/3.parquet, /testdata/tpch/plan_sf0.02/nation/4.parquet, /testdata/tpch/plan_sf0.02/nation/5.parquet], [/testdata/tpch/plan_sf0.02/nation/6.parquet, /testdata/tpch/plan_sf0.02/nation/7.parquet, /testdata/tpch/plan_sf0.02/nation/8.parquet], ...]}, projection=[n_nationkey, n_name], file_type=parquet, predicate=n_name@1 = GERMANY, pruning_predicate=n_name_null_count@2 != row_count@3 AND n_name_min@0 <= GERMANY AND GERMANY <= n_name_max@1, required_guarantees=[n_name in (GERMANY)] └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/supplier/1.parquet], [/testdata/tpch/plan_sf0.02/supplier/10.parquet], [/testdata/tpch/plan_sf0.02/supplier/11.parquet], [/testdata/tpch/plan_sf0.02/supplier/12.parquet], [/testdata/tpch/plan_sf0.02/supplier/13.parquet], ...]}, projection=[s_suppkey, s_nationkey], file_type=parquet, predicate=DynamicFilter [ empty ] + ┌───── Stage 2 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/supplier/1.parquet, /testdata/tpch/plan_sf0.02/supplier/10.parquet, /testdata/tpch/plan_sf0.02/supplier/11.parquet], [/testdata/tpch/plan_sf0.02/supplier/12.parquet, /testdata/tpch/plan_sf0.02/supplier/13.parquet, /testdata/tpch/plan_sf0.02/supplier/14.parquet], [/testdata/tpch/plan_sf0.02/supplier/15.parquet, /testdata/tpch/plan_sf0.02/supplier/16.parquet, /testdata/tpch/plan_sf0.02/supplier/2.parquet], [/testdata/tpch/plan_sf0.02/supplier/3.parquet, /testdata/tpch/plan_sf0.02/supplier/4.parquet, /testdata/tpch/plan_sf0.02/supplier/5.parquet], [/testdata/tpch/plan_sf0.02/supplier/6.parquet, /testdata/tpch/plan_sf0.02/supplier/7.parquet, /testdata/tpch/plan_sf0.02/supplier/8.parquet], ...]}, projection=[s_suppkey, s_nationkey], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 6 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] t3:[p0..p5] - │ RepartitionExec: partitioning=Hash([ps_partkey@0], 6), input_partitions=4 + │ RepartitionExec: partitioning=Hash([ps_partkey@0], 6), input_partitions=6 │ AggregateExec: mode=Partial, gby=[ps_partkey@0 as ps_partkey], aggr=[sum(partsupp.ps_supplycost * partsupp.ps_availqty)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(n_nationkey@0, s_nationkey@3)], projection=[ps_partkey@1, ps_availqty@2, ps_supplycost@3] │ CoalescePartitionsExec - │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ ProjectionExec: expr=[ps_partkey@1 as ps_partkey, ps_availqty@2 as ps_availqty, ps_supplycost@3 as ps_supplycost, s_nationkey@0 as s_nationkey] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(s_suppkey@0, ps_suppkey@1)], projection=[s_nationkey@1, ps_partkey@2, ps_availqty@4, ps_supplycost@5] │ CoalescePartitionsExec - │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/partsupp/1.parquet], [/testdata/tpch/plan_sf0.02/partsupp/10.parquet], [/testdata/tpch/plan_sf0.02/partsupp/11.parquet], [/testdata/tpch/plan_sf0.02/partsupp/12.parquet], [/testdata/tpch/plan_sf0.02/partsupp/13.parquet], ...]}, projection=[ps_partkey, ps_suppkey, ps_availqty, ps_supplycost], file_type=parquet, predicate=DynamicFilter [ empty ] + │ [Stage 5] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/partsupp/1.parquet, /testdata/tpch/plan_sf0.02/partsupp/10.parquet, /testdata/tpch/plan_sf0.02/partsupp/11.parquet], [/testdata/tpch/plan_sf0.02/partsupp/12.parquet, /testdata/tpch/plan_sf0.02/partsupp/13.parquet, /testdata/tpch/plan_sf0.02/partsupp/14.parquet], [/testdata/tpch/plan_sf0.02/partsupp/15.parquet, /testdata/tpch/plan_sf0.02/partsupp/16.parquet, /testdata/tpch/plan_sf0.02/partsupp/2.parquet], [/testdata/tpch/plan_sf0.02/partsupp/3.parquet, /testdata/tpch/plan_sf0.02/partsupp/4.parquet, /testdata/tpch/plan_sf0.02/partsupp/5.parquet], [/testdata/tpch/plan_sf0.02/partsupp/6.parquet, /testdata/tpch/plan_sf0.02/partsupp/7.parquet, /testdata/tpch/plan_sf0.02/partsupp/8.parquet], ...]}, projection=[ps_partkey, ps_suppkey, ps_availqty, ps_supplycost], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 4 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 + ┌───── Stage 4 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 │ FilterExec: n_name@1 = GERMANY, projection=[n_nationkey@0] - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/nation/1.parquet], [/testdata/tpch/plan_sf0.02/nation/10.parquet], [/testdata/tpch/plan_sf0.02/nation/11.parquet], [/testdata/tpch/plan_sf0.02/nation/12.parquet], [/testdata/tpch/plan_sf0.02/nation/13.parquet], ...]}, projection=[n_nationkey, n_name], file_type=parquet, predicate=n_name@1 = GERMANY, pruning_predicate=n_name_null_count@2 != row_count@3 AND n_name_min@0 <= GERMANY AND GERMANY <= n_name_max@1, required_guarantees=[n_name in (GERMANY)] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/nation/1.parquet, /testdata/tpch/plan_sf0.02/nation/10.parquet, /testdata/tpch/plan_sf0.02/nation/11.parquet], [/testdata/tpch/plan_sf0.02/nation/12.parquet, /testdata/tpch/plan_sf0.02/nation/13.parquet, /testdata/tpch/plan_sf0.02/nation/14.parquet], [/testdata/tpch/plan_sf0.02/nation/15.parquet, /testdata/tpch/plan_sf0.02/nation/16.parquet, /testdata/tpch/plan_sf0.02/nation/2.parquet], [/testdata/tpch/plan_sf0.02/nation/3.parquet, /testdata/tpch/plan_sf0.02/nation/4.parquet, /testdata/tpch/plan_sf0.02/nation/5.parquet], [/testdata/tpch/plan_sf0.02/nation/6.parquet, /testdata/tpch/plan_sf0.02/nation/7.parquet, /testdata/tpch/plan_sf0.02/nation/8.parquet], ...]}, projection=[n_nationkey, n_name], file_type=parquet, predicate=n_name@1 = GERMANY, pruning_predicate=n_name_null_count@2 != row_count@3 AND n_name_min@0 <= GERMANY AND GERMANY <= n_name_max@1, required_guarantees=[n_name in (GERMANY)] └────────────────────────────────────────────────── - ┌───── Stage 5 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/supplier/1.parquet], [/testdata/tpch/plan_sf0.02/supplier/10.parquet], [/testdata/tpch/plan_sf0.02/supplier/11.parquet], [/testdata/tpch/plan_sf0.02/supplier/12.parquet], [/testdata/tpch/plan_sf0.02/supplier/13.parquet], ...]}, projection=[s_suppkey, s_nationkey], file_type=parquet, predicate=DynamicFilter [ empty ] + ┌───── Stage 5 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/supplier/1.parquet, /testdata/tpch/plan_sf0.02/supplier/10.parquet, /testdata/tpch/plan_sf0.02/supplier/11.parquet], [/testdata/tpch/plan_sf0.02/supplier/12.parquet, /testdata/tpch/plan_sf0.02/supplier/13.parquet, /testdata/tpch/plan_sf0.02/supplier/14.parquet], [/testdata/tpch/plan_sf0.02/supplier/15.parquet, /testdata/tpch/plan_sf0.02/supplier/16.parquet, /testdata/tpch/plan_sf0.02/supplier/2.parquet], [/testdata/tpch/plan_sf0.02/supplier/3.parquet, /testdata/tpch/plan_sf0.02/supplier/4.parquet, /testdata/tpch/plan_sf0.02/supplier/5.parquet], [/testdata/tpch/plan_sf0.02/supplier/6.parquet, /testdata/tpch/plan_sf0.02/supplier/7.parquet, /testdata/tpch/plan_sf0.02/supplier/8.parquet], ...]}, projection=[s_suppkey, s_nationkey], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── "); Ok(()) @@ -659,19 +608,17 @@ mod tests { │ [Stage 2] => NetworkShuffleExec: output_partitions=6, input_tasks=4 └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p17] t1:[p0..p17] t2:[p0..p17] t3:[p0..p17] - │ RepartitionExec: partitioning=Hash([l_shipmode@0], 18), input_partitions=4 + │ RepartitionExec: partitioning=Hash([l_shipmode@0], 18), input_partitions=6 │ AggregateExec: mode=Partial, gby=[l_shipmode@0 as l_shipmode], aggr=[sum(CASE WHEN orders.o_orderpriority = Utf8("1-URGENT") OR orders.o_orderpriority = Utf8("2-HIGH") THEN Int64(1) ELSE Int64(0) END), sum(CASE WHEN orders.o_orderpriority != Utf8("1-URGENT") AND orders.o_orderpriority != Utf8("2-HIGH") THEN Int64(1) ELSE Int64(0) END)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(l_orderkey@0, o_orderkey@0)], projection=[l_shipmode@1, o_orderpriority@3] │ CoalescePartitionsExec - │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/orders/1.parquet], [/testdata/tpch/plan_sf0.02/orders/10.parquet], [/testdata/tpch/plan_sf0.02/orders/11.parquet], [/testdata/tpch/plan_sf0.02/orders/12.parquet], [/testdata/tpch/plan_sf0.02/orders/13.parquet], ...]}, projection=[o_orderkey, o_orderpriority], file_type=parquet, predicate=DynamicFilter [ empty ] + │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/orders/1.parquet, /testdata/tpch/plan_sf0.02/orders/10.parquet, /testdata/tpch/plan_sf0.02/orders/11.parquet], [/testdata/tpch/plan_sf0.02/orders/12.parquet, /testdata/tpch/plan_sf0.02/orders/13.parquet, /testdata/tpch/plan_sf0.02/orders/14.parquet], [/testdata/tpch/plan_sf0.02/orders/15.parquet, /testdata/tpch/plan_sf0.02/orders/16.parquet, /testdata/tpch/plan_sf0.02/orders/2.parquet], [/testdata/tpch/plan_sf0.02/orders/3.parquet, /testdata/tpch/plan_sf0.02/orders/4.parquet, /testdata/tpch/plan_sf0.02/orders/5.parquet], [/testdata/tpch/plan_sf0.02/orders/6.parquet, /testdata/tpch/plan_sf0.02/orders/7.parquet, /testdata/tpch/plan_sf0.02/orders/8.parquet], ...]}, projection=[o_orderkey, o_orderpriority], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 + ┌───── Stage 1 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 │ FilterExec: (l_shipmode@4 = MAIL OR l_shipmode@4 = SHIP) AND l_receiptdate@3 > l_commitdate@2 AND l_shipdate@1 < l_commitdate@2 AND l_receiptdate@3 >= 1994-01-01 AND l_receiptdate@3 < 1995-01-01, projection=[l_orderkey@0, l_shipmode@4] - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet], [/testdata/tpch/plan_sf0.02/lineitem/10.parquet], [/testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet], [/testdata/tpch/plan_sf0.02/lineitem/13.parquet], ...]}, projection=[l_orderkey, l_shipdate, l_commitdate, l_receiptdate, l_shipmode], file_type=parquet, predicate=(l_shipmode@14 = MAIL OR l_shipmode@14 = SHIP) AND l_receiptdate@12 > l_commitdate@11 AND l_shipdate@10 < l_commitdate@11 AND l_receiptdate@12 >= 1994-01-01 AND l_receiptdate@12 < 1995-01-01, pruning_predicate=(l_shipmode_null_count@2 != row_count@3 AND l_shipmode_min@0 <= MAIL AND MAIL <= l_shipmode_max@1 OR l_shipmode_null_count@2 != row_count@3 AND l_shipmode_min@0 <= SHIP AND SHIP <= l_shipmode_max@1) AND l_receiptdate_null_count@5 != row_count@3 AND l_receiptdate_max@4 >= 1994-01-01 AND l_receiptdate_null_count@5 != row_count@3 AND l_receiptdate_min@6 < 1995-01-01, required_guarantees=[l_shipmode in (MAIL, SHIP)] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet, /testdata/tpch/plan_sf0.02/lineitem/10.parquet, /testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet, /testdata/tpch/plan_sf0.02/lineitem/13.parquet, /testdata/tpch/plan_sf0.02/lineitem/14.parquet], [/testdata/tpch/plan_sf0.02/lineitem/15.parquet, /testdata/tpch/plan_sf0.02/lineitem/16.parquet, /testdata/tpch/plan_sf0.02/lineitem/2.parquet], [/testdata/tpch/plan_sf0.02/lineitem/3.parquet, /testdata/tpch/plan_sf0.02/lineitem/4.parquet, /testdata/tpch/plan_sf0.02/lineitem/5.parquet], [/testdata/tpch/plan_sf0.02/lineitem/6.parquet, /testdata/tpch/plan_sf0.02/lineitem/7.parquet, /testdata/tpch/plan_sf0.02/lineitem/8.parquet], ...]}, projection=[l_orderkey, l_shipdate, l_commitdate, l_receiptdate, l_shipmode], file_type=parquet, predicate=(l_shipmode@14 = MAIL OR l_shipmode@14 = SHIP) AND l_receiptdate@12 > l_commitdate@11 AND l_shipdate@10 < l_commitdate@11 AND l_receiptdate@12 >= 1994-01-01 AND l_receiptdate@12 < 1995-01-01, pruning_predicate=(l_shipmode_null_count@2 != row_count@3 AND l_shipmode_min@0 <= MAIL AND MAIL <= l_shipmode_max@1 OR l_shipmode_null_count@2 != row_count@3 AND l_shipmode_min@0 <= SHIP AND SHIP <= l_shipmode_max@1) AND l_receiptdate_null_count@5 != row_count@3 AND l_receiptdate_max@4 >= 1994-01-01 AND l_receiptdate_null_count@5 != row_count@3 AND l_receiptdate_min@6 < 1995-01-01, required_guarantees=[l_shipmode in (MAIL, SHIP)] └────────────────────────────────────────────────── "#); Ok(()) @@ -680,7 +627,7 @@ mod tests { #[tokio::test] async fn test_tpch_13() -> Result<(), Box> { let plan = test_tpch_query("q13").await?; - assert_snapshot!(plan, @" + assert_snapshot!(plan, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [custdist@1 DESC, c_count@0 DESC] │ [Stage 3] => NetworkCoalesceExec: output_partitions=12, input_tasks=2 @@ -699,15 +646,13 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=6, input_tasks=4 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p17] t1:[p0..p17] t2:[p0..p17] t3:[p0..p17] - │ RepartitionExec: partitioning=Hash([c_custkey@0], 18), input_partitions=4 + │ RepartitionExec: partitioning=Hash([c_custkey@0], 18), input_partitions=6 │ AggregateExec: mode=Partial, gby=[c_custkey@0 as c_custkey], aggr=[count(orders.o_orderkey)] │ HashJoinExec: mode=CollectLeft, join_type=Left, on=[(c_custkey@0, o_custkey@1)], projection=[c_custkey@0, o_orderkey@1] │ CoalescePartitionsExec - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/customer/1.parquet], [/testdata/tpch/plan_sf0.02/customer/10.parquet], [/testdata/tpch/plan_sf0.02/customer/11.parquet], [/testdata/tpch/plan_sf0.02/customer/12.parquet], [/testdata/tpch/plan_sf0.02/customer/13.parquet], ...]}, projection=[c_custkey], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/customer/1.parquet, /testdata/tpch/plan_sf0.02/customer/10.parquet, /testdata/tpch/plan_sf0.02/customer/11.parquet], [/testdata/tpch/plan_sf0.02/customer/12.parquet, /testdata/tpch/plan_sf0.02/customer/13.parquet, /testdata/tpch/plan_sf0.02/customer/14.parquet], [/testdata/tpch/plan_sf0.02/customer/15.parquet, /testdata/tpch/plan_sf0.02/customer/16.parquet, /testdata/tpch/plan_sf0.02/customer/2.parquet], [/testdata/tpch/plan_sf0.02/customer/3.parquet, /testdata/tpch/plan_sf0.02/customer/4.parquet, /testdata/tpch/plan_sf0.02/customer/5.parquet], [/testdata/tpch/plan_sf0.02/customer/6.parquet, /testdata/tpch/plan_sf0.02/customer/7.parquet, /testdata/tpch/plan_sf0.02/customer/8.parquet], ...]}, projection=[c_custkey], file_type=parquet │ FilterExec: o_comment@2 NOT LIKE %special%requests%, projection=[o_orderkey@0, o_custkey@1] - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/orders/1.parquet], [/testdata/tpch/plan_sf0.02/orders/10.parquet], [/testdata/tpch/plan_sf0.02/orders/11.parquet], [/testdata/tpch/plan_sf0.02/orders/12.parquet], [/testdata/tpch/plan_sf0.02/orders/13.parquet], ...]}, projection=[o_orderkey, o_custkey, o_comment], file_type=parquet, predicate=o_comment@8 NOT LIKE %special%requests% + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/orders/1.parquet, /testdata/tpch/plan_sf0.02/orders/10.parquet, /testdata/tpch/plan_sf0.02/orders/11.parquet], [/testdata/tpch/plan_sf0.02/orders/12.parquet, /testdata/tpch/plan_sf0.02/orders/13.parquet, /testdata/tpch/plan_sf0.02/orders/14.parquet], [/testdata/tpch/plan_sf0.02/orders/15.parquet, /testdata/tpch/plan_sf0.02/orders/16.parquet, /testdata/tpch/plan_sf0.02/orders/2.parquet], [/testdata/tpch/plan_sf0.02/orders/3.parquet, /testdata/tpch/plan_sf0.02/orders/4.parquet, /testdata/tpch/plan_sf0.02/orders/5.parquet], [/testdata/tpch/plan_sf0.02/orders/6.parquet, /testdata/tpch/plan_sf0.02/orders/7.parquet, /testdata/tpch/plan_sf0.02/orders/8.parquet], ...]}, projection=[o_orderkey, o_custkey, o_comment], file_type=parquet, predicate=o_comment@8 NOT LIKE %special%requests% └────────────────────────────────────────────────── "); Ok(()) @@ -721,22 +666,20 @@ mod tests { │ ProjectionExec: expr=[100 * CAST(sum(CASE WHEN part.p_type LIKE Utf8("PROMO%") THEN lineitem.l_extendedprice * Int64(1) - lineitem.l_discount ELSE Int64(0) END)@0 AS Float64) / CAST(sum(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount)@1 AS Float64) as promo_revenue] │ AggregateExec: mode=Final, gby=[], aggr=[sum(CASE WHEN part.p_type LIKE Utf8("PROMO%") THEN lineitem.l_extendedprice * Int64(1) - lineitem.l_discount ELSE Int64(0) END), sum(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount)] │ CoalescePartitionsExec - │ [Stage 2] => NetworkCoalesceExec: output_partitions=16, input_tasks=4 + │ [Stage 2] => NetworkCoalesceExec: output_partitions=24, input_tasks=4 └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p3] t1:[p4..p7] t2:[p8..p11] t3:[p12..p15] + ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] t3:[p18..p23] │ AggregateExec: mode=Partial, gby=[], aggr=[sum(CASE WHEN part.p_type LIKE Utf8("PROMO%") THEN lineitem.l_extendedprice * Int64(1) - lineitem.l_discount ELSE Int64(0) END), sum(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount)] │ ProjectionExec: expr=[l_extendedprice@1 * (Some(1),20,0 - l_discount@2) as __common_expr_1, p_type@0 as p_type] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(p_partkey@0, l_partkey@0)], projection=[p_type@1, l_extendedprice@3, l_discount@4] │ CoalescePartitionsExec - │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ FilterExec: l_shipdate@3 >= 1995-09-01 AND l_shipdate@3 < 1995-10-01, projection=[l_partkey@0, l_extendedprice@1, l_discount@2] - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet], [/testdata/tpch/plan_sf0.02/lineitem/10.parquet], [/testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet], [/testdata/tpch/plan_sf0.02/lineitem/13.parquet], ...]}, projection=[l_partkey, l_extendedprice, l_discount, l_shipdate], file_type=parquet, predicate=l_shipdate@10 >= 1995-09-01 AND l_shipdate@10 < 1995-10-01 AND DynamicFilter [ empty ], pruning_predicate=l_shipdate_null_count@1 != row_count@2 AND l_shipdate_max@0 >= 1995-09-01 AND l_shipdate_null_count@1 != row_count@2 AND l_shipdate_min@3 < 1995-10-01, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet, /testdata/tpch/plan_sf0.02/lineitem/10.parquet, /testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet, /testdata/tpch/plan_sf0.02/lineitem/13.parquet, /testdata/tpch/plan_sf0.02/lineitem/14.parquet], [/testdata/tpch/plan_sf0.02/lineitem/15.parquet, /testdata/tpch/plan_sf0.02/lineitem/16.parquet, /testdata/tpch/plan_sf0.02/lineitem/2.parquet], [/testdata/tpch/plan_sf0.02/lineitem/3.parquet, /testdata/tpch/plan_sf0.02/lineitem/4.parquet, /testdata/tpch/plan_sf0.02/lineitem/5.parquet], [/testdata/tpch/plan_sf0.02/lineitem/6.parquet, /testdata/tpch/plan_sf0.02/lineitem/7.parquet, /testdata/tpch/plan_sf0.02/lineitem/8.parquet], ...]}, projection=[l_partkey, l_extendedprice, l_discount, l_shipdate], file_type=parquet, predicate=l_shipdate@10 >= 1995-09-01 AND l_shipdate@10 < 1995-10-01 AND DynamicFilter [ empty ], pruning_predicate=l_shipdate_null_count@1 != row_count@2 AND l_shipdate_max@0 >= 1995-09-01 AND l_shipdate_null_count@1 != row_count@2 AND l_shipdate_min@3 < 1995-10-01, required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/part/1.parquet], [/testdata/tpch/plan_sf0.02/part/10.parquet], [/testdata/tpch/plan_sf0.02/part/11.parquet], [/testdata/tpch/plan_sf0.02/part/12.parquet], [/testdata/tpch/plan_sf0.02/part/13.parquet], ...]}, projection=[p_partkey, p_type], file_type=parquet + ┌───── Stage 1 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/part/1.parquet, /testdata/tpch/plan_sf0.02/part/10.parquet, /testdata/tpch/plan_sf0.02/part/11.parquet], [/testdata/tpch/plan_sf0.02/part/12.parquet, /testdata/tpch/plan_sf0.02/part/13.parquet, /testdata/tpch/plan_sf0.02/part/14.parquet], [/testdata/tpch/plan_sf0.02/part/15.parquet, /testdata/tpch/plan_sf0.02/part/16.parquet, /testdata/tpch/plan_sf0.02/part/2.parquet], [/testdata/tpch/plan_sf0.02/part/3.parquet, /testdata/tpch/plan_sf0.02/part/4.parquet, /testdata/tpch/plan_sf0.02/part/5.parquet], [/testdata/tpch/plan_sf0.02/part/6.parquet, /testdata/tpch/plan_sf0.02/part/7.parquet, /testdata/tpch/plan_sf0.02/part/8.parquet], ...]}, projection=[p_partkey, p_type], file_type=parquet └────────────────────────────────────────────────── "#); Ok(()) @@ -745,7 +688,7 @@ mod tests { #[tokio::test] async fn test_tpch_15() -> Result<(), Box> { let plan = test_tpch_query("q15").await?; - assert_snapshot!(plan, @" + assert_snapshot!(plan, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [s_suppkey@0 ASC NULLS LAST] │ [Stage 6] => NetworkCoalesceExec: output_partitions=24, input_tasks=4 @@ -757,7 +700,7 @@ mod tests { │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=1, stage_partitions=4, input_tasks=1 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(s_suppkey@0, supplier_no@0)], projection=[s_suppkey@0, s_name@1, s_address@2, s_phone@3, total_revenue@5] │ CoalescePartitionsExec - │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ ProjectionExec: expr=[l_suppkey@0 as supplier_no, sum(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount)@1 as total_revenue] │ AggregateExec: mode=FinalPartitioned, gby=[l_suppkey@0 as l_suppkey], aggr=[sum(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount)] │ [Stage 5] => NetworkShuffleExec: output_partitions=6, input_tasks=4 @@ -775,23 +718,20 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=6, input_tasks=4 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p11] t1:[p0..p11] t2:[p0..p11] t3:[p0..p11] - │ RepartitionExec: partitioning=Hash([l_suppkey@0], 12), input_partitions=4 + │ RepartitionExec: partitioning=Hash([l_suppkey@0], 12), input_partitions=6 │ AggregateExec: mode=Partial, gby=[l_suppkey@0 as l_suppkey], aggr=[sum(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount)] │ FilterExec: l_shipdate@3 >= 1996-01-01 AND l_shipdate@3 < 1996-04-01, projection=[l_suppkey@0, l_extendedprice@1, l_discount@2] - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet], [/testdata/tpch/plan_sf0.02/lineitem/10.parquet], [/testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet], [/testdata/tpch/plan_sf0.02/lineitem/13.parquet], ...]}, projection=[l_suppkey, l_extendedprice, l_discount, l_shipdate], file_type=parquet, predicate=l_shipdate@10 >= 1996-01-01 AND l_shipdate@10 < 1996-04-01, pruning_predicate=l_shipdate_null_count@1 != row_count@2 AND l_shipdate_max@0 >= 1996-01-01 AND l_shipdate_null_count@1 != row_count@2 AND l_shipdate_min@3 < 1996-04-01, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet, /testdata/tpch/plan_sf0.02/lineitem/10.parquet, /testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet, /testdata/tpch/plan_sf0.02/lineitem/13.parquet, /testdata/tpch/plan_sf0.02/lineitem/14.parquet], [/testdata/tpch/plan_sf0.02/lineitem/15.parquet, /testdata/tpch/plan_sf0.02/lineitem/16.parquet, /testdata/tpch/plan_sf0.02/lineitem/2.parquet], [/testdata/tpch/plan_sf0.02/lineitem/3.parquet, /testdata/tpch/plan_sf0.02/lineitem/4.parquet, /testdata/tpch/plan_sf0.02/lineitem/5.parquet], [/testdata/tpch/plan_sf0.02/lineitem/6.parquet, /testdata/tpch/plan_sf0.02/lineitem/7.parquet, /testdata/tpch/plan_sf0.02/lineitem/8.parquet], ...]}, projection=[l_suppkey, l_extendedprice, l_discount, l_shipdate], file_type=parquet, predicate=l_shipdate@10 >= 1996-01-01 AND l_shipdate@10 < 1996-04-01, pruning_predicate=l_shipdate_null_count@1 != row_count@2 AND l_shipdate_max@0 >= 1996-01-01 AND l_shipdate_null_count@1 != row_count@2 AND l_shipdate_min@3 < 1996-04-01, required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 4 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/supplier/1.parquet], [/testdata/tpch/plan_sf0.02/supplier/10.parquet], [/testdata/tpch/plan_sf0.02/supplier/11.parquet], [/testdata/tpch/plan_sf0.02/supplier/12.parquet], [/testdata/tpch/plan_sf0.02/supplier/13.parquet], ...]}, projection=[s_suppkey, s_name, s_address, s_phone], file_type=parquet + ┌───── Stage 4 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/supplier/1.parquet, /testdata/tpch/plan_sf0.02/supplier/10.parquet, /testdata/tpch/plan_sf0.02/supplier/11.parquet], [/testdata/tpch/plan_sf0.02/supplier/12.parquet, /testdata/tpch/plan_sf0.02/supplier/13.parquet, /testdata/tpch/plan_sf0.02/supplier/14.parquet], [/testdata/tpch/plan_sf0.02/supplier/15.parquet, /testdata/tpch/plan_sf0.02/supplier/16.parquet, /testdata/tpch/plan_sf0.02/supplier/2.parquet], [/testdata/tpch/plan_sf0.02/supplier/3.parquet, /testdata/tpch/plan_sf0.02/supplier/4.parquet, /testdata/tpch/plan_sf0.02/supplier/5.parquet], [/testdata/tpch/plan_sf0.02/supplier/6.parquet, /testdata/tpch/plan_sf0.02/supplier/7.parquet, /testdata/tpch/plan_sf0.02/supplier/8.parquet], ...]}, projection=[s_suppkey, s_name, s_address, s_phone], file_type=parquet └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p23] t1:[p0..p23] t2:[p0..p23] t3:[p0..p23] - │ RepartitionExec: partitioning=Hash([l_suppkey@0], 24), input_partitions=4 + │ RepartitionExec: partitioning=Hash([l_suppkey@0], 24), input_partitions=6 │ AggregateExec: mode=Partial, gby=[l_suppkey@0 as l_suppkey], aggr=[sum(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount)] │ FilterExec: l_shipdate@3 >= 1996-01-01 AND l_shipdate@3 < 1996-04-01, projection=[l_suppkey@0, l_extendedprice@1, l_discount@2] - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet], [/testdata/tpch/plan_sf0.02/lineitem/10.parquet], [/testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet], [/testdata/tpch/plan_sf0.02/lineitem/13.parquet], ...]}, projection=[l_suppkey, l_extendedprice, l_discount, l_shipdate], file_type=parquet, predicate=l_shipdate@10 >= 1996-01-01 AND l_shipdate@10 < 1996-04-01 AND DynamicFilter [ empty ], pruning_predicate=l_shipdate_null_count@1 != row_count@2 AND l_shipdate_max@0 >= 1996-01-01 AND l_shipdate_null_count@1 != row_count@2 AND l_shipdate_min@3 < 1996-04-01, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet, /testdata/tpch/plan_sf0.02/lineitem/10.parquet, /testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet, /testdata/tpch/plan_sf0.02/lineitem/13.parquet, /testdata/tpch/plan_sf0.02/lineitem/14.parquet], [/testdata/tpch/plan_sf0.02/lineitem/15.parquet, /testdata/tpch/plan_sf0.02/lineitem/16.parquet, /testdata/tpch/plan_sf0.02/lineitem/2.parquet], [/testdata/tpch/plan_sf0.02/lineitem/3.parquet, /testdata/tpch/plan_sf0.02/lineitem/4.parquet, /testdata/tpch/plan_sf0.02/lineitem/5.parquet], [/testdata/tpch/plan_sf0.02/lineitem/6.parquet, /testdata/tpch/plan_sf0.02/lineitem/7.parquet, /testdata/tpch/plan_sf0.02/lineitem/8.parquet], ...]}, projection=[l_suppkey, l_extendedprice, l_discount, l_shipdate], file_type=parquet, predicate=l_shipdate@10 >= 1996-01-01 AND l_shipdate@10 < 1996-04-01 AND DynamicFilter [ empty ], pruning_predicate=l_shipdate_null_count@1 != row_count@2 AND l_shipdate_max@0 >= 1996-01-01 AND l_shipdate_null_count@1 != row_count@2 AND l_shipdate_min@3 < 1996-04-01, required_guarantees=[] └────────────────────────────────────────────────── "); Ok(()) @@ -800,7 +740,7 @@ mod tests { #[tokio::test] async fn test_tpch_16() -> Result<(), Box> { let plan = test_tpch_query("q16").await?; - assert_snapshot!(plan, @" + assert_snapshot!(plan, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [supplier_cnt@3 DESC, p_brand@0 ASC NULLS LAST, p_type@1 ASC NULLS LAST, p_size@2 ASC NULLS LAST] │ [Stage 5] => NetworkCoalesceExec: output_partitions=12, input_tasks=2 @@ -818,29 +758,26 @@ mod tests { │ [Stage 3] => NetworkShuffleExec: output_partitions=6, input_tasks=4 └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p17] t1:[p0..p17] t2:[p0..p17] t3:[p0..p17] - │ RepartitionExec: partitioning=Hash([p_brand@0, p_type@1, p_size@2, alias1@3], 18), input_partitions=4 + │ RepartitionExec: partitioning=Hash([p_brand@0, p_type@1, p_size@2, alias1@3], 18), input_partitions=6 │ AggregateExec: mode=Partial, gby=[p_brand@1 as p_brand, p_type@2 as p_type, p_size@3 as p_size, ps_suppkey@0 as alias1], aggr=[] │ HashJoinExec: mode=CollectLeft, join_type=RightAnti, on=[(s_suppkey@0, ps_suppkey@0)] │ CoalescePartitionsExec - │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ ProjectionExec: expr=[ps_suppkey@3 as ps_suppkey, p_brand@0 as p_brand, p_type@1 as p_type, p_size@2 as p_size] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(p_partkey@0, ps_partkey@0)], projection=[p_brand@1, p_type@2, p_size@3, ps_suppkey@5] │ CoalescePartitionsExec - │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/partsupp/1.parquet], [/testdata/tpch/plan_sf0.02/partsupp/10.parquet], [/testdata/tpch/plan_sf0.02/partsupp/11.parquet], [/testdata/tpch/plan_sf0.02/partsupp/12.parquet], [/testdata/tpch/plan_sf0.02/partsupp/13.parquet], ...]}, projection=[ps_partkey, ps_suppkey], file_type=parquet, predicate=DynamicFilter [ empty ] + │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/partsupp/1.parquet, /testdata/tpch/plan_sf0.02/partsupp/10.parquet, /testdata/tpch/plan_sf0.02/partsupp/11.parquet], [/testdata/tpch/plan_sf0.02/partsupp/12.parquet, /testdata/tpch/plan_sf0.02/partsupp/13.parquet, /testdata/tpch/plan_sf0.02/partsupp/14.parquet], [/testdata/tpch/plan_sf0.02/partsupp/15.parquet, /testdata/tpch/plan_sf0.02/partsupp/16.parquet, /testdata/tpch/plan_sf0.02/partsupp/2.parquet], [/testdata/tpch/plan_sf0.02/partsupp/3.parquet, /testdata/tpch/plan_sf0.02/partsupp/4.parquet, /testdata/tpch/plan_sf0.02/partsupp/5.parquet], [/testdata/tpch/plan_sf0.02/partsupp/6.parquet, /testdata/tpch/plan_sf0.02/partsupp/7.parquet, /testdata/tpch/plan_sf0.02/partsupp/8.parquet], ...]}, projection=[ps_partkey, ps_suppkey], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 + ┌───── Stage 1 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 │ FilterExec: s_comment@1 LIKE %Customer%Complaints%, projection=[s_suppkey@0] - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/supplier/1.parquet], [/testdata/tpch/plan_sf0.02/supplier/10.parquet], [/testdata/tpch/plan_sf0.02/supplier/11.parquet], [/testdata/tpch/plan_sf0.02/supplier/12.parquet], [/testdata/tpch/plan_sf0.02/supplier/13.parquet], ...]}, projection=[s_suppkey, s_comment], file_type=parquet, predicate=s_comment@6 LIKE %Customer%Complaints% + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/supplier/1.parquet, /testdata/tpch/plan_sf0.02/supplier/10.parquet, /testdata/tpch/plan_sf0.02/supplier/11.parquet], [/testdata/tpch/plan_sf0.02/supplier/12.parquet, /testdata/tpch/plan_sf0.02/supplier/13.parquet, /testdata/tpch/plan_sf0.02/supplier/14.parquet], [/testdata/tpch/plan_sf0.02/supplier/15.parquet, /testdata/tpch/plan_sf0.02/supplier/16.parquet, /testdata/tpch/plan_sf0.02/supplier/2.parquet], [/testdata/tpch/plan_sf0.02/supplier/3.parquet, /testdata/tpch/plan_sf0.02/supplier/4.parquet, /testdata/tpch/plan_sf0.02/supplier/5.parquet], [/testdata/tpch/plan_sf0.02/supplier/6.parquet, /testdata/tpch/plan_sf0.02/supplier/7.parquet, /testdata/tpch/plan_sf0.02/supplier/8.parquet], ...]}, projection=[s_suppkey, s_comment], file_type=parquet, predicate=s_comment@6 LIKE %Customer%Complaints% └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 + ┌───── Stage 2 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 │ FilterExec: p_brand@1 != Brand#45 AND p_type@2 NOT LIKE MEDIUM POLISHED% AND p_size@3 IN (SET) ([49, 14, 23, 45, 19, 3, 36, 9]) - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/part/1.parquet], [/testdata/tpch/plan_sf0.02/part/10.parquet], [/testdata/tpch/plan_sf0.02/part/11.parquet], [/testdata/tpch/plan_sf0.02/part/12.parquet], [/testdata/tpch/plan_sf0.02/part/13.parquet], ...]}, projection=[p_partkey, p_brand, p_type, p_size], file_type=parquet, predicate=p_brand@3 != Brand#45 AND p_type@4 NOT LIKE MEDIUM POLISHED% AND p_size@5 IN (SET) ([49, 14, 23, 45, 19, 3, 36, 9]), pruning_predicate=p_brand_null_count@2 != row_count@3 AND (p_brand_min@0 != Brand#45 OR Brand#45 != p_brand_max@1) AND p_type_null_count@6 != row_count@3 AND (p_type_min@4 NOT LIKE MEDIUM POLISHED% OR p_type_max@5 NOT LIKE MEDIUM POLISHED%) AND (p_size_null_count@9 != row_count@3 AND p_size_min@7 <= 49 AND 49 <= p_size_max@8 OR p_size_null_count@9 != row_count@3 AND p_size_min@7 <= 14 AND 14 <= p_size_max@8 OR p_size_null_count@9 != row_count@3 AND p_size_min@7 <= 23 AND 23 <= p_size_max@8 OR p_size_null_count@9 != row_count@3 AND p_size_min@7 <= 45 AND 45 <= p_size_max@8 OR p_size_null_count@9 != row_count@3 AND p_size_min@7 <= 19 AND 19 <= p_size_max@8 OR p_size_null_count@9 != row_count@3 AND p_size_min@7 <= 3 AND 3 <= p_size_max@8 OR p_size_null_count@9 != row_count@3 AND p_size_min@7 <= 36 AND 36 <= p_size_max@8 OR p_size_null_count@9 != row_count@3 AND p_size_min@7 <= 9 AND 9 <= p_size_max@8), required_guarantees=[p_brand not in (Brand#45), p_size in (14, 19, 23, 3, 36, 45, 49, 9)] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/part/1.parquet, /testdata/tpch/plan_sf0.02/part/10.parquet, /testdata/tpch/plan_sf0.02/part/11.parquet], [/testdata/tpch/plan_sf0.02/part/12.parquet, /testdata/tpch/plan_sf0.02/part/13.parquet, /testdata/tpch/plan_sf0.02/part/14.parquet], [/testdata/tpch/plan_sf0.02/part/15.parquet, /testdata/tpch/plan_sf0.02/part/16.parquet, /testdata/tpch/plan_sf0.02/part/2.parquet], [/testdata/tpch/plan_sf0.02/part/3.parquet, /testdata/tpch/plan_sf0.02/part/4.parquet, /testdata/tpch/plan_sf0.02/part/5.parquet], [/testdata/tpch/plan_sf0.02/part/6.parquet, /testdata/tpch/plan_sf0.02/part/7.parquet, /testdata/tpch/plan_sf0.02/part/8.parquet], ...]}, projection=[p_partkey, p_brand, p_type, p_size], file_type=parquet, predicate=p_brand@3 != Brand#45 AND p_type@4 NOT LIKE MEDIUM POLISHED% AND p_size@5 IN (SET) ([49, 14, 23, 45, 19, 3, 36, 9]), pruning_predicate=p_brand_null_count@2 != row_count@3 AND (p_brand_min@0 != Brand#45 OR Brand#45 != p_brand_max@1) AND p_type_null_count@6 != row_count@3 AND (p_type_min@4 NOT LIKE MEDIUM POLISHED% OR p_type_max@5 NOT LIKE MEDIUM POLISHED%) AND (p_size_null_count@9 != row_count@3 AND p_size_min@7 <= 49 AND 49 <= p_size_max@8 OR p_size_null_count@9 != row_count@3 AND p_size_min@7 <= 14 AND 14 <= p_size_max@8 OR p_size_null_count@9 != row_count@3 AND p_size_min@7 <= 23 AND 23 <= p_size_max@8 OR p_size_null_count@9 != row_count@3 AND p_size_min@7 <= 45 AND 45 <= p_size_max@8 OR p_size_null_count@9 != row_count@3 AND p_size_min@7 <= 19 AND 19 <= p_size_max@8 OR p_size_null_count@9 != row_count@3 AND p_size_min@7 <= 3 AND 3 <= p_size_max@8 OR p_size_null_count@9 != row_count@3 AND p_size_min@7 <= 36 AND 36 <= p_size_max@8 OR p_size_null_count@9 != row_count@3 AND p_size_min@7 <= 9 AND 9 <= p_size_max@8), required_guarantees=[p_brand not in (Brand#45), p_size in (14, 19, 23, 3, 36, 45, 49, 9)] └────────────────────────────────────────────────── "); Ok(()) @@ -849,7 +786,7 @@ mod tests { #[tokio::test] async fn test_tpch_17() -> Result<(), Box> { let plan = test_tpch_query("q17").await?; - assert_snapshot!(plan, @" + assert_snapshot!(plan, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ ProjectionExec: expr=[CAST(sum(lineitem.l_extendedprice)@0 AS Float64) / 7 as avg_yearly] │ AggregateExec: mode=Final, gby=[], aggr=[sum(lineitem.l_extendedprice)] @@ -860,31 +797,28 @@ mod tests { │ AggregateExec: mode=Partial, gby=[], aggr=[sum(lineitem.l_extendedprice)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(p_partkey@2, l_partkey@1)], filter=CAST(l_quantity@0 AS Decimal128(30, 15)) < Float64(0.2) * avg(lineitem.l_quantity)@1, projection=[l_extendedprice@1] │ CoalescePartitionsExec - │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ ProjectionExec: expr=[CAST(0.2 * CAST(avg(lineitem.l_quantity)@1 AS Float64) AS Decimal128(30, 15)) as Float64(0.2) * avg(lineitem.l_quantity), l_partkey@0 as l_partkey] │ AggregateExec: mode=FinalPartitioned, gby=[l_partkey@0 as l_partkey], aggr=[avg(lineitem.l_quantity)] │ [Stage 3] => NetworkShuffleExec: output_partitions=6, input_tasks=4 └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 + ┌───── Stage 2 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 │ ProjectionExec: expr=[l_quantity@1 as l_quantity, l_extendedprice@2 as l_extendedprice, p_partkey@0 as p_partkey] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(p_partkey@0, l_partkey@0)], projection=[p_partkey@0, l_quantity@2, l_extendedprice@3] │ CoalescePartitionsExec - │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet], [/testdata/tpch/plan_sf0.02/lineitem/10.parquet], [/testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet], [/testdata/tpch/plan_sf0.02/lineitem/13.parquet], ...]}, projection=[l_partkey, l_quantity, l_extendedprice], file_type=parquet, predicate=DynamicFilter [ empty ] + │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet, /testdata/tpch/plan_sf0.02/lineitem/10.parquet, /testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet, /testdata/tpch/plan_sf0.02/lineitem/13.parquet, /testdata/tpch/plan_sf0.02/lineitem/14.parquet], [/testdata/tpch/plan_sf0.02/lineitem/15.parquet, /testdata/tpch/plan_sf0.02/lineitem/16.parquet, /testdata/tpch/plan_sf0.02/lineitem/2.parquet], [/testdata/tpch/plan_sf0.02/lineitem/3.parquet, /testdata/tpch/plan_sf0.02/lineitem/4.parquet, /testdata/tpch/plan_sf0.02/lineitem/5.parquet], [/testdata/tpch/plan_sf0.02/lineitem/6.parquet, /testdata/tpch/plan_sf0.02/lineitem/7.parquet, /testdata/tpch/plan_sf0.02/lineitem/8.parquet], ...]}, projection=[l_partkey, l_quantity, l_extendedprice], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 + ┌───── Stage 1 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 │ FilterExec: p_brand@1 = Brand#23 AND p_container@2 = MED BOX, projection=[p_partkey@0] - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/part/1.parquet], [/testdata/tpch/plan_sf0.02/part/10.parquet], [/testdata/tpch/plan_sf0.02/part/11.parquet], [/testdata/tpch/plan_sf0.02/part/12.parquet], [/testdata/tpch/plan_sf0.02/part/13.parquet], ...]}, projection=[p_partkey, p_brand, p_container], file_type=parquet, predicate=p_brand@3 = Brand#23 AND p_container@6 = MED BOX, pruning_predicate=p_brand_null_count@2 != row_count@3 AND p_brand_min@0 <= Brand#23 AND Brand#23 <= p_brand_max@1 AND p_container_null_count@6 != row_count@3 AND p_container_min@4 <= MED BOX AND MED BOX <= p_container_max@5, required_guarantees=[p_brand in (Brand#23), p_container in (MED BOX)] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/part/1.parquet, /testdata/tpch/plan_sf0.02/part/10.parquet, /testdata/tpch/plan_sf0.02/part/11.parquet], [/testdata/tpch/plan_sf0.02/part/12.parquet, /testdata/tpch/plan_sf0.02/part/13.parquet, /testdata/tpch/plan_sf0.02/part/14.parquet], [/testdata/tpch/plan_sf0.02/part/15.parquet, /testdata/tpch/plan_sf0.02/part/16.parquet, /testdata/tpch/plan_sf0.02/part/2.parquet], [/testdata/tpch/plan_sf0.02/part/3.parquet, /testdata/tpch/plan_sf0.02/part/4.parquet, /testdata/tpch/plan_sf0.02/part/5.parquet], [/testdata/tpch/plan_sf0.02/part/6.parquet, /testdata/tpch/plan_sf0.02/part/7.parquet, /testdata/tpch/plan_sf0.02/part/8.parquet], ...]}, projection=[p_partkey, p_brand, p_container], file_type=parquet, predicate=p_brand@3 = Brand#23 AND p_container@6 = MED BOX, pruning_predicate=p_brand_null_count@2 != row_count@3 AND p_brand_min@0 <= Brand#23 AND Brand#23 <= p_brand_max@1 AND p_container_null_count@6 != row_count@3 AND p_container_min@4 <= MED BOX AND MED BOX <= p_container_max@5, required_guarantees=[p_brand in (Brand#23), p_container in (MED BOX)] └────────────────────────────────────────────────── ┌───── Stage 3 ── Tasks: t0:[p0..p23] t1:[p0..p23] t2:[p0..p23] t3:[p0..p23] - │ RepartitionExec: partitioning=Hash([l_partkey@0], 24), input_partitions=4 + │ RepartitionExec: partitioning=Hash([l_partkey@0], 24), input_partitions=6 │ AggregateExec: mode=Partial, gby=[l_partkey@0 as l_partkey], aggr=[avg(lineitem.l_quantity)] - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet], [/testdata/tpch/plan_sf0.02/lineitem/10.parquet], [/testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet], [/testdata/tpch/plan_sf0.02/lineitem/13.parquet], ...]}, projection=[l_partkey, l_quantity], file_type=parquet, predicate=DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet, /testdata/tpch/plan_sf0.02/lineitem/10.parquet, /testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet, /testdata/tpch/plan_sf0.02/lineitem/13.parquet, /testdata/tpch/plan_sf0.02/lineitem/14.parquet], [/testdata/tpch/plan_sf0.02/lineitem/15.parquet, /testdata/tpch/plan_sf0.02/lineitem/16.parquet, /testdata/tpch/plan_sf0.02/lineitem/2.parquet], [/testdata/tpch/plan_sf0.02/lineitem/3.parquet, /testdata/tpch/plan_sf0.02/lineitem/4.parquet, /testdata/tpch/plan_sf0.02/lineitem/5.parquet], [/testdata/tpch/plan_sf0.02/lineitem/6.parquet, /testdata/tpch/plan_sf0.02/lineitem/7.parquet, /testdata/tpch/plan_sf0.02/lineitem/8.parquet], ...]}, projection=[l_partkey, l_quantity], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── "); Ok(()) @@ -893,7 +827,7 @@ mod tests { #[tokio::test] async fn test_tpch_18() -> Result<(), Box> { let plan = test_tpch_query("q18").await?; - assert_snapshot!(plan, @" + assert_snapshot!(plan, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [o_totalprice@4 DESC, o_orderdate@3 ASC NULLS LAST] │ [Stage 6] => NetworkCoalesceExec: output_partitions=18, input_tasks=3 @@ -904,16 +838,15 @@ mod tests { │ [Stage 5] => NetworkShuffleExec: output_partitions=6, input_tasks=4 └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p17] t1:[p0..p17] t2:[p0..p17] t3:[p0..p17] - │ RepartitionExec: partitioning=Hash([c_name@0, c_custkey@1, o_orderkey@2, o_orderdate@3, o_totalprice@4], 18), input_partitions=4 + │ RepartitionExec: partitioning=Hash([c_name@0, c_custkey@1, o_orderkey@2, o_orderdate@3, o_totalprice@4], 18), input_partitions=6 │ AggregateExec: mode=Partial, gby=[c_name@1 as c_name, c_custkey@0 as c_custkey, o_orderkey@2 as o_orderkey, o_orderdate@4 as o_orderdate, o_totalprice@3 as o_totalprice], aggr=[sum(lineitem.l_quantity)] │ HashJoinExec: mode=CollectLeft, join_type=RightSemi, on=[(l_orderkey@0, o_orderkey@2)] │ CoalescePartitionsExec │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=3 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(o_orderkey@2, l_orderkey@0)], projection=[c_custkey@0, c_name@1, o_orderkey@2, o_totalprice@3, o_orderdate@4, l_quantity@6] │ CoalescePartitionsExec - │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet], [/testdata/tpch/plan_sf0.02/lineitem/10.parquet], [/testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet], [/testdata/tpch/plan_sf0.02/lineitem/13.parquet], ...]}, projection=[l_orderkey, l_quantity], file_type=parquet, predicate=DynamicFilter [ empty ] + │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet, /testdata/tpch/plan_sf0.02/lineitem/10.parquet, /testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet, /testdata/tpch/plan_sf0.02/lineitem/13.parquet, /testdata/tpch/plan_sf0.02/lineitem/14.parquet], [/testdata/tpch/plan_sf0.02/lineitem/15.parquet, /testdata/tpch/plan_sf0.02/lineitem/16.parquet, /testdata/tpch/plan_sf0.02/lineitem/2.parquet], [/testdata/tpch/plan_sf0.02/lineitem/3.parquet, /testdata/tpch/plan_sf0.02/lineitem/4.parquet, /testdata/tpch/plan_sf0.02/lineitem/5.parquet], [/testdata/tpch/plan_sf0.02/lineitem/6.parquet, /testdata/tpch/plan_sf0.02/lineitem/7.parquet, /testdata/tpch/plan_sf0.02/lineitem/8.parquet], ...]}, projection=[l_orderkey, l_quantity], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── ┌───── Stage 2 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 @@ -922,23 +855,20 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=6, input_tasks=4 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p17] t1:[p0..p17] t2:[p0..p17] t3:[p0..p17] - │ RepartitionExec: partitioning=Hash([l_orderkey@0], 18), input_partitions=4 + │ RepartitionExec: partitioning=Hash([l_orderkey@0], 18), input_partitions=6 │ AggregateExec: mode=Partial, gby=[l_orderkey@0 as l_orderkey], aggr=[sum(lineitem.l_quantity)] - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet], [/testdata/tpch/plan_sf0.02/lineitem/10.parquet], [/testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet], [/testdata/tpch/plan_sf0.02/lineitem/13.parquet], ...]}, projection=[l_orderkey, l_quantity], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet, /testdata/tpch/plan_sf0.02/lineitem/10.parquet, /testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet, /testdata/tpch/plan_sf0.02/lineitem/13.parquet, /testdata/tpch/plan_sf0.02/lineitem/14.parquet], [/testdata/tpch/plan_sf0.02/lineitem/15.parquet, /testdata/tpch/plan_sf0.02/lineitem/16.parquet, /testdata/tpch/plan_sf0.02/lineitem/2.parquet], [/testdata/tpch/plan_sf0.02/lineitem/3.parquet, /testdata/tpch/plan_sf0.02/lineitem/4.parquet, /testdata/tpch/plan_sf0.02/lineitem/5.parquet], [/testdata/tpch/plan_sf0.02/lineitem/6.parquet, /testdata/tpch/plan_sf0.02/lineitem/7.parquet, /testdata/tpch/plan_sf0.02/lineitem/8.parquet], ...]}, projection=[l_orderkey, l_quantity], file_type=parquet └────────────────────────────────────────────────── - ┌───── Stage 4 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 + ┌───── Stage 4 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(c_custkey@0, o_custkey@1)], projection=[c_custkey@0, c_name@1, o_orderkey@2, o_totalprice@4, o_orderdate@5] │ CoalescePartitionsExec - │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/orders/1.parquet], [/testdata/tpch/plan_sf0.02/orders/10.parquet], [/testdata/tpch/plan_sf0.02/orders/11.parquet], [/testdata/tpch/plan_sf0.02/orders/12.parquet], [/testdata/tpch/plan_sf0.02/orders/13.parquet], ...]}, projection=[o_orderkey, o_custkey, o_totalprice, o_orderdate], file_type=parquet, predicate=DynamicFilter [ empty ] + │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/orders/1.parquet, /testdata/tpch/plan_sf0.02/orders/10.parquet, /testdata/tpch/plan_sf0.02/orders/11.parquet], [/testdata/tpch/plan_sf0.02/orders/12.parquet, /testdata/tpch/plan_sf0.02/orders/13.parquet, /testdata/tpch/plan_sf0.02/orders/14.parquet], [/testdata/tpch/plan_sf0.02/orders/15.parquet, /testdata/tpch/plan_sf0.02/orders/16.parquet, /testdata/tpch/plan_sf0.02/orders/2.parquet], [/testdata/tpch/plan_sf0.02/orders/3.parquet, /testdata/tpch/plan_sf0.02/orders/4.parquet, /testdata/tpch/plan_sf0.02/orders/5.parquet], [/testdata/tpch/plan_sf0.02/orders/6.parquet, /testdata/tpch/plan_sf0.02/orders/7.parquet, /testdata/tpch/plan_sf0.02/orders/8.parquet], ...]}, projection=[o_orderkey, o_custkey, o_totalprice, o_orderdate], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 3 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/customer/1.parquet], [/testdata/tpch/plan_sf0.02/customer/10.parquet], [/testdata/tpch/plan_sf0.02/customer/11.parquet], [/testdata/tpch/plan_sf0.02/customer/12.parquet], [/testdata/tpch/plan_sf0.02/customer/13.parquet], ...]}, projection=[c_custkey, c_name], file_type=parquet + ┌───── Stage 3 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/customer/1.parquet, /testdata/tpch/plan_sf0.02/customer/10.parquet, /testdata/tpch/plan_sf0.02/customer/11.parquet], [/testdata/tpch/plan_sf0.02/customer/12.parquet, /testdata/tpch/plan_sf0.02/customer/13.parquet, /testdata/tpch/plan_sf0.02/customer/14.parquet], [/testdata/tpch/plan_sf0.02/customer/15.parquet, /testdata/tpch/plan_sf0.02/customer/16.parquet, /testdata/tpch/plan_sf0.02/customer/2.parquet], [/testdata/tpch/plan_sf0.02/customer/3.parquet, /testdata/tpch/plan_sf0.02/customer/4.parquet, /testdata/tpch/plan_sf0.02/customer/5.parquet], [/testdata/tpch/plan_sf0.02/customer/6.parquet, /testdata/tpch/plan_sf0.02/customer/7.parquet, /testdata/tpch/plan_sf0.02/customer/8.parquet], ...]}, projection=[c_custkey, c_name], file_type=parquet └────────────────────────────────────────────────── "); Ok(()) @@ -947,27 +877,25 @@ mod tests { #[tokio::test] async fn test_tpch_19() -> Result<(), Box> { let plan = test_tpch_query("q19").await?; - assert_snapshot!(plan, @" + assert_snapshot!(plan, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ ProjectionExec: expr=[sum(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount)@0 as revenue] │ AggregateExec: mode=Final, gby=[], aggr=[sum(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount)] │ CoalescePartitionsExec - │ [Stage 2] => NetworkCoalesceExec: output_partitions=16, input_tasks=4 + │ [Stage 2] => NetworkCoalesceExec: output_partitions=24, input_tasks=4 └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p3] t1:[p4..p7] t2:[p8..p11] t3:[p12..p15] + ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] t3:[p18..p23] │ AggregateExec: mode=Partial, gby=[], aggr=[sum(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount)] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(p_partkey@0, l_partkey@0)], filter=p_brand@1 = Brand#12 AND p_container@3 IN (SET) ([SM CASE, SM BOX, SM PACK, SM PKG]) AND l_quantity@0 >= Some(100),15,2 AND l_quantity@0 <= Some(1100),15,2 AND p_size@2 <= 5 OR p_brand@1 = Brand#23 AND p_container@3 IN (SET) ([MED BAG, MED BOX, MED PKG, MED PACK]) AND l_quantity@0 >= Some(1000),15,2 AND l_quantity@0 <= Some(2000),15,2 AND p_size@2 <= 10 OR p_brand@1 = Brand#34 AND p_container@3 IN (SET) ([LG CASE, LG BOX, LG PACK, LG PKG]) AND l_quantity@0 >= Some(2000),15,2 AND l_quantity@0 <= Some(3000),15,2 AND p_size@2 <= 15, projection=[l_extendedprice@6, l_discount@7] │ CoalescePartitionsExec - │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ FilterExec: (l_quantity@1 >= Some(100),15,2 AND l_quantity@1 <= Some(1100),15,2 OR l_quantity@1 >= Some(1000),15,2 AND l_quantity@1 <= Some(2000),15,2 OR l_quantity@1 >= Some(2000),15,2 AND l_quantity@1 <= Some(3000),15,2) AND (l_shipmode@5 = AIR OR l_shipmode@5 = AIR REG) AND l_shipinstruct@4 = DELIVER IN PERSON, projection=[l_partkey@0, l_quantity@1, l_extendedprice@2, l_discount@3] - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet], [/testdata/tpch/plan_sf0.02/lineitem/10.parquet], [/testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet], [/testdata/tpch/plan_sf0.02/lineitem/13.parquet], ...]}, projection=[l_partkey, l_quantity, l_extendedprice, l_discount, l_shipinstruct, l_shipmode], file_type=parquet, predicate=(l_quantity@4 >= Some(100),15,2 AND l_quantity@4 <= Some(1100),15,2 OR l_quantity@4 >= Some(1000),15,2 AND l_quantity@4 <= Some(2000),15,2 OR l_quantity@4 >= Some(2000),15,2 AND l_quantity@4 <= Some(3000),15,2) AND (l_shipmode@14 = AIR OR l_shipmode@14 = AIR REG) AND l_shipinstruct@13 = DELIVER IN PERSON AND DynamicFilter [ empty ], pruning_predicate=(l_quantity_null_count@1 != row_count@2 AND l_quantity_max@0 >= Some(100),15,2 AND l_quantity_null_count@1 != row_count@2 AND l_quantity_min@3 <= Some(1100),15,2 OR l_quantity_null_count@1 != row_count@2 AND l_quantity_max@0 >= Some(1000),15,2 AND l_quantity_null_count@1 != row_count@2 AND l_quantity_min@3 <= Some(2000),15,2 OR l_quantity_null_count@1 != row_count@2 AND l_quantity_max@0 >= Some(2000),15,2 AND l_quantity_null_count@1 != row_count@2 AND l_quantity_min@3 <= Some(3000),15,2) AND (l_shipmode_null_count@6 != row_count@2 AND l_shipmode_min@4 <= AIR AND AIR <= l_shipmode_max@5 OR l_shipmode_null_count@6 != row_count@2 AND l_shipmode_min@4 <= AIR REG AND AIR REG <= l_shipmode_max@5) AND l_shipinstruct_null_count@9 != row_count@2 AND l_shipinstruct_min@7 <= DELIVER IN PERSON AND DELIVER IN PERSON <= l_shipinstruct_max@8, required_guarantees=[l_shipinstruct in (DELIVER IN PERSON), l_shipmode in (AIR, AIR REG)] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet, /testdata/tpch/plan_sf0.02/lineitem/10.parquet, /testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet, /testdata/tpch/plan_sf0.02/lineitem/13.parquet, /testdata/tpch/plan_sf0.02/lineitem/14.parquet], [/testdata/tpch/plan_sf0.02/lineitem/15.parquet, /testdata/tpch/plan_sf0.02/lineitem/16.parquet, /testdata/tpch/plan_sf0.02/lineitem/2.parquet], [/testdata/tpch/plan_sf0.02/lineitem/3.parquet, /testdata/tpch/plan_sf0.02/lineitem/4.parquet, /testdata/tpch/plan_sf0.02/lineitem/5.parquet], [/testdata/tpch/plan_sf0.02/lineitem/6.parquet, /testdata/tpch/plan_sf0.02/lineitem/7.parquet, /testdata/tpch/plan_sf0.02/lineitem/8.parquet], ...]}, projection=[l_partkey, l_quantity, l_extendedprice, l_discount, l_shipinstruct, l_shipmode], file_type=parquet, predicate=(l_quantity@4 >= Some(100),15,2 AND l_quantity@4 <= Some(1100),15,2 OR l_quantity@4 >= Some(1000),15,2 AND l_quantity@4 <= Some(2000),15,2 OR l_quantity@4 >= Some(2000),15,2 AND l_quantity@4 <= Some(3000),15,2) AND (l_shipmode@14 = AIR OR l_shipmode@14 = AIR REG) AND l_shipinstruct@13 = DELIVER IN PERSON AND DynamicFilter [ empty ], pruning_predicate=(l_quantity_null_count@1 != row_count@2 AND l_quantity_max@0 >= Some(100),15,2 AND l_quantity_null_count@1 != row_count@2 AND l_quantity_min@3 <= Some(1100),15,2 OR l_quantity_null_count@1 != row_count@2 AND l_quantity_max@0 >= Some(1000),15,2 AND l_quantity_null_count@1 != row_count@2 AND l_quantity_min@3 <= Some(2000),15,2 OR l_quantity_null_count@1 != row_count@2 AND l_quantity_max@0 >= Some(2000),15,2 AND l_quantity_null_count@1 != row_count@2 AND l_quantity_min@3 <= Some(3000),15,2) AND (l_shipmode_null_count@6 != row_count@2 AND l_shipmode_min@4 <= AIR AND AIR <= l_shipmode_max@5 OR l_shipmode_null_count@6 != row_count@2 AND l_shipmode_min@4 <= AIR REG AND AIR REG <= l_shipmode_max@5) AND l_shipinstruct_null_count@9 != row_count@2 AND l_shipinstruct_min@7 <= DELIVER IN PERSON AND DELIVER IN PERSON <= l_shipinstruct_max@8, required_guarantees=[l_shipinstruct in (DELIVER IN PERSON), l_shipmode in (AIR, AIR REG)] └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 + ┌───── Stage 1 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 │ FilterExec: (p_brand@1 = Brand#12 AND p_container@3 IN (SET) ([SM CASE, SM BOX, SM PACK, SM PKG]) AND p_size@2 <= 5 OR p_brand@1 = Brand#23 AND p_container@3 IN (SET) ([MED BAG, MED BOX, MED PKG, MED PACK]) AND p_size@2 <= 10 OR p_brand@1 = Brand#34 AND p_container@3 IN (SET) ([LG CASE, LG BOX, LG PACK, LG PKG]) AND p_size@2 <= 15) AND p_size@2 >= 1 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/part/1.parquet], [/testdata/tpch/plan_sf0.02/part/10.parquet], [/testdata/tpch/plan_sf0.02/part/11.parquet], [/testdata/tpch/plan_sf0.02/part/12.parquet], [/testdata/tpch/plan_sf0.02/part/13.parquet], ...]}, projection=[p_partkey, p_brand, p_size, p_container], file_type=parquet, predicate=(p_brand@3 = Brand#12 AND p_container@6 IN (SET) ([SM CASE, SM BOX, SM PACK, SM PKG]) AND p_size@5 <= 5 OR p_brand@3 = Brand#23 AND p_container@6 IN (SET) ([MED BAG, MED BOX, MED PKG, MED PACK]) AND p_size@5 <= 10 OR p_brand@3 = Brand#34 AND p_container@6 IN (SET) ([LG CASE, LG BOX, LG PACK, LG PKG]) AND p_size@5 <= 15) AND p_size@5 >= 1, pruning_predicate=(p_brand_null_count@2 != row_count@3 AND p_brand_min@0 <= Brand#12 AND Brand#12 <= p_brand_max@1 AND (p_container_null_count@6 != row_count@3 AND p_container_min@4 <= SM CASE AND SM CASE <= p_container_max@5 OR p_container_null_count@6 != row_count@3 AND p_container_min@4 <= SM BOX AND SM BOX <= p_container_max@5 OR p_container_null_count@6 != row_count@3 AND p_container_min@4 <= SM PACK AND SM PACK <= p_container_max@5 OR p_container_null_count@6 != row_count@3 AND p_container_min@4 <= SM PKG AND SM PKG <= p_container_max@5) AND p_size_null_count@8 != row_count@3 AND p_size_min@7 <= 5 OR p_brand_null_count@2 != row_count@3 AND p_brand_min@0 <= Brand#23 AND Brand#23 <= p_brand_max@1 AND (p_container_null_count@6 != row_count@3 AND p_container_min@4 <= MED BAG AND MED BAG <= p_container_max@5 OR p_container_null_count@6 != row_count@3 AND p_container_min@4 <= MED BOX AND MED BOX <= p_container_max@5 OR p_container_null_count@6 != row_count@3 AND p_container_min@4 <= MED PKG AND MED PKG <= p_container_max@5 OR p_container_null_count@6 != row_count@3 AND p_container_min@4 <= MED PACK AND MED PACK <= p_container_max@5) AND p_size_null_count@8 != row_count@3 AND p_size_min@7 <= 10 OR p_brand_null_count@2 != row_count@3 AND p_brand_min@0 <= Brand#34 AND Brand#34 <= p_brand_max@1 AND (p_container_null_count@6 != row_count@3 AND p_container_min@4 <= LG CASE AND LG CASE <= p_container_max@5 OR p_container_null_count@6 != row_count@3 AND p_container_min@4 <= LG BOX AND LG BOX <= p_container_max@5 OR p_container_null_count@6 != row_count@3 AND p_container_min@4 <= LG PACK AND LG PACK <= p_container_max@5 OR p_container_null_count@6 != row_count@3 AND p_container_min@4 <= LG PKG AND LG PKG <= p_container_max@5) AND p_size_null_count@8 != row_count@3 AND p_size_min@7 <= 15) AND p_size_null_count@8 != row_count@3 AND p_size_max@9 >= 1, required_guarantees=[p_brand in (Brand#12, Brand#23, Brand#34), p_container in (LG BOX, LG CASE, LG PACK, LG PKG, MED BAG, MED BOX, MED PACK, MED PKG, SM BOX, SM CASE, SM PACK, SM PKG)] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/part/1.parquet, /testdata/tpch/plan_sf0.02/part/10.parquet, /testdata/tpch/plan_sf0.02/part/11.parquet], [/testdata/tpch/plan_sf0.02/part/12.parquet, /testdata/tpch/plan_sf0.02/part/13.parquet, /testdata/tpch/plan_sf0.02/part/14.parquet], [/testdata/tpch/plan_sf0.02/part/15.parquet, /testdata/tpch/plan_sf0.02/part/16.parquet, /testdata/tpch/plan_sf0.02/part/2.parquet], [/testdata/tpch/plan_sf0.02/part/3.parquet, /testdata/tpch/plan_sf0.02/part/4.parquet, /testdata/tpch/plan_sf0.02/part/5.parquet], [/testdata/tpch/plan_sf0.02/part/6.parquet, /testdata/tpch/plan_sf0.02/part/7.parquet, /testdata/tpch/plan_sf0.02/part/8.parquet], ...]}, projection=[p_partkey, p_brand, p_size, p_container], file_type=parquet, predicate=(p_brand@3 = Brand#12 AND p_container@6 IN (SET) ([SM CASE, SM BOX, SM PACK, SM PKG]) AND p_size@5 <= 5 OR p_brand@3 = Brand#23 AND p_container@6 IN (SET) ([MED BAG, MED BOX, MED PKG, MED PACK]) AND p_size@5 <= 10 OR p_brand@3 = Brand#34 AND p_container@6 IN (SET) ([LG CASE, LG BOX, LG PACK, LG PKG]) AND p_size@5 <= 15) AND p_size@5 >= 1, pruning_predicate=(p_brand_null_count@2 != row_count@3 AND p_brand_min@0 <= Brand#12 AND Brand#12 <= p_brand_max@1 AND (p_container_null_count@6 != row_count@3 AND p_container_min@4 <= SM CASE AND SM CASE <= p_container_max@5 OR p_container_null_count@6 != row_count@3 AND p_container_min@4 <= SM BOX AND SM BOX <= p_container_max@5 OR p_container_null_count@6 != row_count@3 AND p_container_min@4 <= SM PACK AND SM PACK <= p_container_max@5 OR p_container_null_count@6 != row_count@3 AND p_container_min@4 <= SM PKG AND SM PKG <= p_container_max@5) AND p_size_null_count@8 != row_count@3 AND p_size_min@7 <= 5 OR p_brand_null_count@2 != row_count@3 AND p_brand_min@0 <= Brand#23 AND Brand#23 <= p_brand_max@1 AND (p_container_null_count@6 != row_count@3 AND p_container_min@4 <= MED BAG AND MED BAG <= p_container_max@5 OR p_container_null_count@6 != row_count@3 AND p_container_min@4 <= MED BOX AND MED BOX <= p_container_max@5 OR p_container_null_count@6 != row_count@3 AND p_container_min@4 <= MED PKG AND MED PKG <= p_container_max@5 OR p_container_null_count@6 != row_count@3 AND p_container_min@4 <= MED PACK AND MED PACK <= p_container_max@5) AND p_size_null_count@8 != row_count@3 AND p_size_min@7 <= 10 OR p_brand_null_count@2 != row_count@3 AND p_brand_min@0 <= Brand#34 AND Brand#34 <= p_brand_max@1 AND (p_container_null_count@6 != row_count@3 AND p_container_min@4 <= LG CASE AND LG CASE <= p_container_max@5 OR p_container_null_count@6 != row_count@3 AND p_container_min@4 <= LG BOX AND LG BOX <= p_container_max@5 OR p_container_null_count@6 != row_count@3 AND p_container_min@4 <= LG PACK AND LG PACK <= p_container_max@5 OR p_container_null_count@6 != row_count@3 AND p_container_min@4 <= LG PKG AND LG PKG <= p_container_max@5) AND p_size_null_count@8 != row_count@3 AND p_size_min@7 <= 15) AND p_size_null_count@8 != row_count@3 AND p_size_max@9 >= 1, required_guarantees=[p_brand in (Brand#12, Brand#23, Brand#34), p_container in (LG BOX, LG CASE, LG PACK, LG PKG, MED BAG, MED BOX, MED PACK, MED PKG, SM BOX, SM CASE, SM PACK, SM PKG)] └────────────────────────────────────────────────── "); Ok(()) @@ -982,47 +910,42 @@ mod tests { │ SortExec: expr=[s_name@0 ASC NULLS LAST], preserve_partitioning=[true] │ HashJoinExec: mode=CollectLeft, join_type=LeftSemi, on=[(s_suppkey@0, ps_suppkey@0)], projection=[s_name@1, s_address@2] │ CoalescePartitionsExec - │ [Stage 2] => NetworkCoalesceExec: output_partitions=16, input_tasks=4 + │ [Stage 2] => NetworkCoalesceExec: output_partitions=24, input_tasks=4 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(ps_partkey@0, l_partkey@1), (ps_suppkey@1, l_suppkey@2)], filter=CAST(ps_availqty@0 AS Float64) > Float64(0.5) * sum(lineitem.l_quantity)@1, projection=[ps_suppkey@1] │ CoalescePartitionsExec - │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=4, input_tasks=4 + │ [Stage 4] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=6, input_tasks=4 │ ProjectionExec: expr=[0.5 * CAST(sum(lineitem.l_quantity)@2 AS Float64) as Float64(0.5) * sum(lineitem.l_quantity), l_partkey@0 as l_partkey, l_suppkey@1 as l_suppkey] │ AggregateExec: mode=FinalPartitioned, gby=[l_partkey@0 as l_partkey, l_suppkey@1 as l_suppkey], aggr=[sum(lineitem.l_quantity)] │ [Stage 5] => NetworkShuffleExec: output_partitions=6, input_tasks=4 └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p3] t1:[p4..p7] t2:[p8..p11] t3:[p12..p15] + ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] t3:[p18..p23] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(n_nationkey@0, s_nationkey@3)], projection=[s_suppkey@1, s_name@2, s_address@3] │ CoalescePartitionsExec - │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/supplier/1.parquet], [/testdata/tpch/plan_sf0.02/supplier/10.parquet], [/testdata/tpch/plan_sf0.02/supplier/11.parquet], [/testdata/tpch/plan_sf0.02/supplier/12.parquet], [/testdata/tpch/plan_sf0.02/supplier/13.parquet], ...]}, projection=[s_suppkey, s_name, s_address, s_nationkey], file_type=parquet, predicate=DynamicFilter [ empty ] + │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/supplier/1.parquet, /testdata/tpch/plan_sf0.02/supplier/10.parquet, /testdata/tpch/plan_sf0.02/supplier/11.parquet], [/testdata/tpch/plan_sf0.02/supplier/12.parquet, /testdata/tpch/plan_sf0.02/supplier/13.parquet, /testdata/tpch/plan_sf0.02/supplier/14.parquet], [/testdata/tpch/plan_sf0.02/supplier/15.parquet, /testdata/tpch/plan_sf0.02/supplier/16.parquet, /testdata/tpch/plan_sf0.02/supplier/2.parquet], [/testdata/tpch/plan_sf0.02/supplier/3.parquet, /testdata/tpch/plan_sf0.02/supplier/4.parquet, /testdata/tpch/plan_sf0.02/supplier/5.parquet], [/testdata/tpch/plan_sf0.02/supplier/6.parquet, /testdata/tpch/plan_sf0.02/supplier/7.parquet, /testdata/tpch/plan_sf0.02/supplier/8.parquet], ...]}, projection=[s_suppkey, s_name, s_address, s_nationkey], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 + ┌───── Stage 1 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 │ FilterExec: n_name@1 = CANADA, projection=[n_nationkey@0] - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/nation/1.parquet], [/testdata/tpch/plan_sf0.02/nation/10.parquet], [/testdata/tpch/plan_sf0.02/nation/11.parquet], [/testdata/tpch/plan_sf0.02/nation/12.parquet], [/testdata/tpch/plan_sf0.02/nation/13.parquet], ...]}, projection=[n_nationkey, n_name], file_type=parquet, predicate=n_name@1 = CANADA, pruning_predicate=n_name_null_count@2 != row_count@3 AND n_name_min@0 <= CANADA AND CANADA <= n_name_max@1, required_guarantees=[n_name in (CANADA)] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/nation/1.parquet, /testdata/tpch/plan_sf0.02/nation/10.parquet, /testdata/tpch/plan_sf0.02/nation/11.parquet], [/testdata/tpch/plan_sf0.02/nation/12.parquet, /testdata/tpch/plan_sf0.02/nation/13.parquet, /testdata/tpch/plan_sf0.02/nation/14.parquet], [/testdata/tpch/plan_sf0.02/nation/15.parquet, /testdata/tpch/plan_sf0.02/nation/16.parquet, /testdata/tpch/plan_sf0.02/nation/2.parquet], [/testdata/tpch/plan_sf0.02/nation/3.parquet, /testdata/tpch/plan_sf0.02/nation/4.parquet, /testdata/tpch/plan_sf0.02/nation/5.parquet], [/testdata/tpch/plan_sf0.02/nation/6.parquet, /testdata/tpch/plan_sf0.02/nation/7.parquet, /testdata/tpch/plan_sf0.02/nation/8.parquet], ...]}, projection=[n_nationkey, n_name], file_type=parquet, predicate=n_name@1 = CANADA, pruning_predicate=n_name_null_count@2 != row_count@3 AND n_name_min@0 <= CANADA AND CANADA <= n_name_max@1, required_guarantees=[n_name in (CANADA)] └────────────────────────────────────────────────── - ┌───── Stage 4 ── Tasks: t0:[p0..p3] t1:[p4..p7] t2:[p8..p11] t3:[p12..p15] - │ BroadcastExec: input_partitions=4, consumer_tasks=1, output_partitions=4 + ┌───── Stage 4 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] t3:[p18..p23] + │ BroadcastExec: input_partitions=6, consumer_tasks=1, output_partitions=6 │ HashJoinExec: mode=CollectLeft, join_type=RightSemi, on=[(p_partkey@0, ps_partkey@0)] │ CoalescePartitionsExec - │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/partsupp/1.parquet], [/testdata/tpch/plan_sf0.02/partsupp/10.parquet], [/testdata/tpch/plan_sf0.02/partsupp/11.parquet], [/testdata/tpch/plan_sf0.02/partsupp/12.parquet], [/testdata/tpch/plan_sf0.02/partsupp/13.parquet], ...]}, projection=[ps_partkey, ps_suppkey, ps_availqty], file_type=parquet + │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/partsupp/1.parquet, /testdata/tpch/plan_sf0.02/partsupp/10.parquet, /testdata/tpch/plan_sf0.02/partsupp/11.parquet], [/testdata/tpch/plan_sf0.02/partsupp/12.parquet, /testdata/tpch/plan_sf0.02/partsupp/13.parquet, /testdata/tpch/plan_sf0.02/partsupp/14.parquet], [/testdata/tpch/plan_sf0.02/partsupp/15.parquet, /testdata/tpch/plan_sf0.02/partsupp/16.parquet, /testdata/tpch/plan_sf0.02/partsupp/2.parquet], [/testdata/tpch/plan_sf0.02/partsupp/3.parquet, /testdata/tpch/plan_sf0.02/partsupp/4.parquet, /testdata/tpch/plan_sf0.02/partsupp/5.parquet], [/testdata/tpch/plan_sf0.02/partsupp/6.parquet, /testdata/tpch/plan_sf0.02/partsupp/7.parquet, /testdata/tpch/plan_sf0.02/partsupp/8.parquet], ...]}, projection=[ps_partkey, ps_suppkey, ps_availqty], file_type=parquet └────────────────────────────────────────────────── - ┌───── Stage 3 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 + ┌───── Stage 3 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 │ FilterExec: p_name@1 LIKE forest%, projection=[p_partkey@0] - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/part/1.parquet], [/testdata/tpch/plan_sf0.02/part/10.parquet], [/testdata/tpch/plan_sf0.02/part/11.parquet], [/testdata/tpch/plan_sf0.02/part/12.parquet], [/testdata/tpch/plan_sf0.02/part/13.parquet], ...]}, projection=[p_partkey, p_name], file_type=parquet, predicate=p_name@1 LIKE forest%, pruning_predicate=p_name_null_count@2 != row_count@3 AND p_name_min@0 <= foresu AND forest <= p_name_max@1, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/part/1.parquet, /testdata/tpch/plan_sf0.02/part/10.parquet, /testdata/tpch/plan_sf0.02/part/11.parquet], [/testdata/tpch/plan_sf0.02/part/12.parquet, /testdata/tpch/plan_sf0.02/part/13.parquet, /testdata/tpch/plan_sf0.02/part/14.parquet], [/testdata/tpch/plan_sf0.02/part/15.parquet, /testdata/tpch/plan_sf0.02/part/16.parquet, /testdata/tpch/plan_sf0.02/part/2.parquet], [/testdata/tpch/plan_sf0.02/part/3.parquet, /testdata/tpch/plan_sf0.02/part/4.parquet, /testdata/tpch/plan_sf0.02/part/5.parquet], [/testdata/tpch/plan_sf0.02/part/6.parquet, /testdata/tpch/plan_sf0.02/part/7.parquet, /testdata/tpch/plan_sf0.02/part/8.parquet], ...]}, projection=[p_partkey, p_name], file_type=parquet, predicate=p_name@1 LIKE forest%, pruning_predicate=p_name_null_count@2 != row_count@3 AND p_name_min@0 <= foresu AND forest <= p_name_max@1, required_guarantees=[] └────────────────────────────────────────────────── ┌───── Stage 5 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] t3:[p0..p5] - │ RepartitionExec: partitioning=Hash([l_partkey@0, l_suppkey@1], 6), input_partitions=4 + │ RepartitionExec: partitioning=Hash([l_partkey@0, l_suppkey@1], 6), input_partitions=6 │ AggregateExec: mode=Partial, gby=[l_partkey@0 as l_partkey, l_suppkey@1 as l_suppkey], aggr=[sum(lineitem.l_quantity)] │ FilterExec: l_shipdate@3 >= 1994-01-01 AND l_shipdate@3 < 1995-01-01, projection=[l_partkey@0, l_suppkey@1, l_quantity@2] - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet], [/testdata/tpch/plan_sf0.02/lineitem/10.parquet], [/testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet], [/testdata/tpch/plan_sf0.02/lineitem/13.parquet], ...]}, projection=[l_partkey, l_suppkey, l_quantity, l_shipdate], file_type=parquet, predicate=l_shipdate@10 >= 1994-01-01 AND l_shipdate@10 < 1995-01-01 AND DynamicFilter [ empty ], pruning_predicate=l_shipdate_null_count@1 != row_count@2 AND l_shipdate_max@0 >= 1994-01-01 AND l_shipdate_null_count@1 != row_count@2 AND l_shipdate_min@3 < 1995-01-01, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet, /testdata/tpch/plan_sf0.02/lineitem/10.parquet, /testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet, /testdata/tpch/plan_sf0.02/lineitem/13.parquet, /testdata/tpch/plan_sf0.02/lineitem/14.parquet], [/testdata/tpch/plan_sf0.02/lineitem/15.parquet, /testdata/tpch/plan_sf0.02/lineitem/16.parquet, /testdata/tpch/plan_sf0.02/lineitem/2.parquet], [/testdata/tpch/plan_sf0.02/lineitem/3.parquet, /testdata/tpch/plan_sf0.02/lineitem/4.parquet, /testdata/tpch/plan_sf0.02/lineitem/5.parquet], [/testdata/tpch/plan_sf0.02/lineitem/6.parquet, /testdata/tpch/plan_sf0.02/lineitem/7.parquet, /testdata/tpch/plan_sf0.02/lineitem/8.parquet], ...]}, projection=[l_partkey, l_suppkey, l_quantity, l_shipdate], file_type=parquet, predicate=l_shipdate@10 >= 1994-01-01 AND l_shipdate@10 < 1995-01-01 AND DynamicFilter [ empty ], pruning_predicate=l_shipdate_null_count@1 != row_count@2 AND l_shipdate_max@0 >= 1994-01-01 AND l_shipdate_null_count@1 != row_count@2 AND l_shipdate_min@3 < 1995-01-01, required_guarantees=[] └────────────────────────────────────────────────── "); Ok(()) @@ -1031,7 +954,7 @@ mod tests { #[tokio::test] async fn test_tpch_21() -> Result<(), Box> { let plan = test_tpch_query("q21").await?; - assert_snapshot!(plan, @" + assert_snapshot!(plan, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [numwait@1 DESC, s_name@0 ASC NULLS LAST] │ SortExec: expr=[numwait@1 DESC, s_name@0 ASC NULLS LAST], preserve_partitioning=[true] @@ -1043,41 +966,37 @@ mod tests { │ CoalescePartitionsExec │ HashJoinExec: mode=CollectLeft, join_type=LeftSemi, on=[(l_orderkey@1, l_orderkey@0)], filter=l_suppkey@1 != l_suppkey@0 │ CoalescePartitionsExec - │ [Stage 4] => NetworkCoalesceExec: output_partitions=16, input_tasks=4 + │ [Stage 4] => NetworkCoalesceExec: output_partitions=24, input_tasks=4 │ DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet, /testdata/tpch/plan_sf0.02/lineitem/10.parquet, /testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet, /testdata/tpch/plan_sf0.02/lineitem/13.parquet, /testdata/tpch/plan_sf0.02/lineitem/14.parquet], [/testdata/tpch/plan_sf0.02/lineitem/15.parquet, /testdata/tpch/plan_sf0.02/lineitem/16.parquet, /testdata/tpch/plan_sf0.02/lineitem/2.parquet], [/testdata/tpch/plan_sf0.02/lineitem/3.parquet, /testdata/tpch/plan_sf0.02/lineitem/4.parquet, /testdata/tpch/plan_sf0.02/lineitem/5.parquet], [/testdata/tpch/plan_sf0.02/lineitem/6.parquet, /testdata/tpch/plan_sf0.02/lineitem/7.parquet, /testdata/tpch/plan_sf0.02/lineitem/8.parquet], ...]}, projection=[l_orderkey, l_suppkey], file_type=parquet │ FilterExec: l_receiptdate@3 > l_commitdate@2, projection=[l_orderkey@0, l_suppkey@1] │ DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet, /testdata/tpch/plan_sf0.02/lineitem/10.parquet, /testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet, /testdata/tpch/plan_sf0.02/lineitem/13.parquet, /testdata/tpch/plan_sf0.02/lineitem/14.parquet], [/testdata/tpch/plan_sf0.02/lineitem/15.parquet, /testdata/tpch/plan_sf0.02/lineitem/16.parquet, /testdata/tpch/plan_sf0.02/lineitem/2.parquet], [/testdata/tpch/plan_sf0.02/lineitem/3.parquet, /testdata/tpch/plan_sf0.02/lineitem/4.parquet, /testdata/tpch/plan_sf0.02/lineitem/5.parquet], [/testdata/tpch/plan_sf0.02/lineitem/6.parquet, /testdata/tpch/plan_sf0.02/lineitem/7.parquet, /testdata/tpch/plan_sf0.02/lineitem/8.parquet], ...]}, projection=[l_orderkey, l_suppkey, l_commitdate, l_receiptdate], file_type=parquet, predicate=l_receiptdate@12 > l_commitdate@11 └────────────────────────────────────────────────── - ┌───── Stage 4 ── Tasks: t0:[p0..p3] t1:[p4..p7] t2:[p8..p11] t3:[p12..p15] + ┌───── Stage 4 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] t3:[p18..p23] │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(n_nationkey@0, s_nationkey@1)], projection=[s_name@1, l_orderkey@3, l_suppkey@4] │ CoalescePartitionsExec - │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 1] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(o_orderkey@0, l_orderkey@2)], projection=[s_name@1, s_nationkey@2, l_orderkey@3, l_suppkey@4] │ CoalescePartitionsExec - │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 2] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(s_suppkey@0, l_suppkey@1)], projection=[s_name@1, s_nationkey@2, l_orderkey@3, l_suppkey@4] │ CoalescePartitionsExec - │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=4, stage_partitions=16, input_tasks=4 + │ [Stage 3] => NetworkBroadcastExec: partitions_per_consumer=6, stage_partitions=24, input_tasks=4 │ FilterExec: l_receiptdate@3 > l_commitdate@2, projection=[l_orderkey@0, l_suppkey@1] - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet], [/testdata/tpch/plan_sf0.02/lineitem/10.parquet], [/testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet], [/testdata/tpch/plan_sf0.02/lineitem/13.parquet], ...]}, projection=[l_orderkey, l_suppkey, l_commitdate, l_receiptdate], file_type=parquet, predicate=l_receiptdate@12 > l_commitdate@11 AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/lineitem/1.parquet, /testdata/tpch/plan_sf0.02/lineitem/10.parquet, /testdata/tpch/plan_sf0.02/lineitem/11.parquet], [/testdata/tpch/plan_sf0.02/lineitem/12.parquet, /testdata/tpch/plan_sf0.02/lineitem/13.parquet, /testdata/tpch/plan_sf0.02/lineitem/14.parquet], [/testdata/tpch/plan_sf0.02/lineitem/15.parquet, /testdata/tpch/plan_sf0.02/lineitem/16.parquet, /testdata/tpch/plan_sf0.02/lineitem/2.parquet], [/testdata/tpch/plan_sf0.02/lineitem/3.parquet, /testdata/tpch/plan_sf0.02/lineitem/4.parquet, /testdata/tpch/plan_sf0.02/lineitem/5.parquet], [/testdata/tpch/plan_sf0.02/lineitem/6.parquet, /testdata/tpch/plan_sf0.02/lineitem/7.parquet, /testdata/tpch/plan_sf0.02/lineitem/8.parquet], ...]}, projection=[l_orderkey, l_suppkey, l_commitdate, l_receiptdate], file_type=parquet, predicate=l_receiptdate@12 > l_commitdate@11 AND DynamicFilter [ empty ] AND DynamicFilter [ empty ] └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 + ┌───── Stage 1 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 │ FilterExec: n_name@1 = SAUDI ARABIA, projection=[n_nationkey@0] - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/nation/1.parquet], [/testdata/tpch/plan_sf0.02/nation/10.parquet], [/testdata/tpch/plan_sf0.02/nation/11.parquet], [/testdata/tpch/plan_sf0.02/nation/12.parquet], [/testdata/tpch/plan_sf0.02/nation/13.parquet], ...]}, projection=[n_nationkey, n_name], file_type=parquet, predicate=n_name@1 = SAUDI ARABIA, pruning_predicate=n_name_null_count@2 != row_count@3 AND n_name_min@0 <= SAUDI ARABIA AND SAUDI ARABIA <= n_name_max@1, required_guarantees=[n_name in (SAUDI ARABIA)] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/nation/1.parquet, /testdata/tpch/plan_sf0.02/nation/10.parquet, /testdata/tpch/plan_sf0.02/nation/11.parquet], [/testdata/tpch/plan_sf0.02/nation/12.parquet, /testdata/tpch/plan_sf0.02/nation/13.parquet, /testdata/tpch/plan_sf0.02/nation/14.parquet], [/testdata/tpch/plan_sf0.02/nation/15.parquet, /testdata/tpch/plan_sf0.02/nation/16.parquet, /testdata/tpch/plan_sf0.02/nation/2.parquet], [/testdata/tpch/plan_sf0.02/nation/3.parquet, /testdata/tpch/plan_sf0.02/nation/4.parquet, /testdata/tpch/plan_sf0.02/nation/5.parquet], [/testdata/tpch/plan_sf0.02/nation/6.parquet, /testdata/tpch/plan_sf0.02/nation/7.parquet, /testdata/tpch/plan_sf0.02/nation/8.parquet], ...]}, projection=[n_nationkey, n_name], file_type=parquet, predicate=n_name@1 = SAUDI ARABIA, pruning_predicate=n_name_null_count@2 != row_count@3 AND n_name_min@0 <= SAUDI ARABIA AND SAUDI ARABIA <= n_name_max@1, required_guarantees=[n_name in (SAUDI ARABIA)] └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 + ┌───── Stage 2 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 │ FilterExec: o_orderstatus@1 = F, projection=[o_orderkey@0] - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/orders/1.parquet], [/testdata/tpch/plan_sf0.02/orders/10.parquet], [/testdata/tpch/plan_sf0.02/orders/11.parquet], [/testdata/tpch/plan_sf0.02/orders/12.parquet], [/testdata/tpch/plan_sf0.02/orders/13.parquet], ...]}, projection=[o_orderkey, o_orderstatus], file_type=parquet, predicate=o_orderstatus@2 = F, pruning_predicate=o_orderstatus_null_count@2 != row_count@3 AND o_orderstatus_min@0 <= F AND F <= o_orderstatus_max@1, required_guarantees=[o_orderstatus in (F)] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/orders/1.parquet, /testdata/tpch/plan_sf0.02/orders/10.parquet, /testdata/tpch/plan_sf0.02/orders/11.parquet], [/testdata/tpch/plan_sf0.02/orders/12.parquet, /testdata/tpch/plan_sf0.02/orders/13.parquet, /testdata/tpch/plan_sf0.02/orders/14.parquet], [/testdata/tpch/plan_sf0.02/orders/15.parquet, /testdata/tpch/plan_sf0.02/orders/16.parquet, /testdata/tpch/plan_sf0.02/orders/2.parquet], [/testdata/tpch/plan_sf0.02/orders/3.parquet, /testdata/tpch/plan_sf0.02/orders/4.parquet, /testdata/tpch/plan_sf0.02/orders/5.parquet], [/testdata/tpch/plan_sf0.02/orders/6.parquet, /testdata/tpch/plan_sf0.02/orders/7.parquet, /testdata/tpch/plan_sf0.02/orders/8.parquet], ...]}, projection=[o_orderkey, o_orderstatus], file_type=parquet, predicate=o_orderstatus@2 = F, pruning_predicate=o_orderstatus_null_count@2 != row_count@3 AND o_orderstatus_min@0 <= F AND F <= o_orderstatus_max@1, required_guarantees=[o_orderstatus in (F)] └────────────────────────────────────────────────── - ┌───── Stage 3 ── Tasks: t0:[p0..p15] t1:[p16..p31] t2:[p32..p47] t3:[p48..p63] - │ BroadcastExec: input_partitions=4, consumer_tasks=4, output_partitions=16 - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/supplier/1.parquet], [/testdata/tpch/plan_sf0.02/supplier/10.parquet], [/testdata/tpch/plan_sf0.02/supplier/11.parquet], [/testdata/tpch/plan_sf0.02/supplier/12.parquet], [/testdata/tpch/plan_sf0.02/supplier/13.parquet], ...]}, projection=[s_suppkey, s_name, s_nationkey], file_type=parquet, predicate=DynamicFilter [ empty ] + ┌───── Stage 3 ── Tasks: t0:[p0..p23] t1:[p24..p47] t2:[p48..p71] t3:[p72..p95] + │ BroadcastExec: input_partitions=6, consumer_tasks=4, output_partitions=24 + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/supplier/1.parquet, /testdata/tpch/plan_sf0.02/supplier/10.parquet, /testdata/tpch/plan_sf0.02/supplier/11.parquet], [/testdata/tpch/plan_sf0.02/supplier/12.parquet, /testdata/tpch/plan_sf0.02/supplier/13.parquet, /testdata/tpch/plan_sf0.02/supplier/14.parquet], [/testdata/tpch/plan_sf0.02/supplier/15.parquet, /testdata/tpch/plan_sf0.02/supplier/16.parquet, /testdata/tpch/plan_sf0.02/supplier/2.parquet], [/testdata/tpch/plan_sf0.02/supplier/3.parquet, /testdata/tpch/plan_sf0.02/supplier/4.parquet, /testdata/tpch/plan_sf0.02/supplier/5.parquet], [/testdata/tpch/plan_sf0.02/supplier/6.parquet, /testdata/tpch/plan_sf0.02/supplier/7.parquet, /testdata/tpch/plan_sf0.02/supplier/8.parquet], ...]}, projection=[s_suppkey, s_name, s_nationkey], file_type=parquet, predicate=DynamicFilter [ empty ] └────────────────────────────────────────────────── "); Ok(()) @@ -1086,7 +1005,7 @@ mod tests { #[tokio::test] async fn test_tpch_22() -> Result<(), Box> { let plan = test_tpch_query("q22").await?; - assert_snapshot!(plan, @" + assert_snapshot!(plan, @r" ┌───── DistributedExec ── Tasks: t0:[p0] │ SortPreservingMergeExec: [cntrycode@0 ASC NULLS LAST] │ SortExec: expr=[cntrycode@0 ASC NULLS LAST], preserve_partitioning=[true] @@ -1098,23 +1017,21 @@ mod tests { │ NestedLoopJoinExec: join_type=Inner, filter=join_proj_push_down_1@1 > avg(customer.c_acctbal)@0, projection=[avg(customer.c_acctbal)@0, c_phone@1, c_acctbal@2] │ AggregateExec: mode=Final, gby=[], aggr=[avg(customer.c_acctbal)] │ CoalescePartitionsExec - │ [Stage 1] => NetworkCoalesceExec: output_partitions=16, input_tasks=4 + │ [Stage 1] => NetworkCoalesceExec: output_partitions=24, input_tasks=4 │ ProjectionExec: expr=[c_phone@0 as c_phone, c_acctbal@1 as c_acctbal, CAST(c_acctbal@1 AS Decimal128(19, 6)) as join_proj_push_down_1] │ HashJoinExec: mode=CollectLeft, join_type=LeftAnti, on=[(c_custkey@0, o_custkey@0)], projection=[c_phone@1, c_acctbal@2] │ CoalescePartitionsExec - │ [Stage 2] => NetworkCoalesceExec: output_partitions=16, input_tasks=4 + │ [Stage 2] => NetworkCoalesceExec: output_partitions=24, input_tasks=4 │ DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/orders/1.parquet, /testdata/tpch/plan_sf0.02/orders/10.parquet, /testdata/tpch/plan_sf0.02/orders/11.parquet], [/testdata/tpch/plan_sf0.02/orders/12.parquet, /testdata/tpch/plan_sf0.02/orders/13.parquet, /testdata/tpch/plan_sf0.02/orders/14.parquet], [/testdata/tpch/plan_sf0.02/orders/15.parquet, /testdata/tpch/plan_sf0.02/orders/16.parquet, /testdata/tpch/plan_sf0.02/orders/2.parquet], [/testdata/tpch/plan_sf0.02/orders/3.parquet, /testdata/tpch/plan_sf0.02/orders/4.parquet, /testdata/tpch/plan_sf0.02/orders/5.parquet], [/testdata/tpch/plan_sf0.02/orders/6.parquet, /testdata/tpch/plan_sf0.02/orders/7.parquet, /testdata/tpch/plan_sf0.02/orders/8.parquet], ...]}, projection=[o_custkey], file_type=parquet └────────────────────────────────────────────────── - ┌───── Stage 1 ── Tasks: t0:[p0..p3] t1:[p4..p7] t2:[p8..p11] t3:[p12..p15] + ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] t3:[p18..p23] │ AggregateExec: mode=Partial, gby=[], aggr=[avg(customer.c_acctbal)] │ FilterExec: c_acctbal@1 > Some(0),15,2 AND substr(c_phone@0, 1, 2) IN (SET) ([13, 31, 23, 29, 30, 18, 17]), projection=[c_acctbal@1] - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/customer/1.parquet], [/testdata/tpch/plan_sf0.02/customer/10.parquet], [/testdata/tpch/plan_sf0.02/customer/11.parquet], [/testdata/tpch/plan_sf0.02/customer/12.parquet], [/testdata/tpch/plan_sf0.02/customer/13.parquet], ...]}, projection=[c_phone, c_acctbal], file_type=parquet, predicate=c_acctbal@5 > Some(0),15,2 AND substr(c_phone@4, 1, 2) IN (SET) ([13, 31, 23, 29, 30, 18, 17]), pruning_predicate=c_acctbal_null_count@1 != row_count@2 AND c_acctbal_max@0 > Some(0),15,2, required_guarantees=[] + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/customer/1.parquet, /testdata/tpch/plan_sf0.02/customer/10.parquet, /testdata/tpch/plan_sf0.02/customer/11.parquet], [/testdata/tpch/plan_sf0.02/customer/12.parquet, /testdata/tpch/plan_sf0.02/customer/13.parquet, /testdata/tpch/plan_sf0.02/customer/14.parquet], [/testdata/tpch/plan_sf0.02/customer/15.parquet, /testdata/tpch/plan_sf0.02/customer/16.parquet, /testdata/tpch/plan_sf0.02/customer/2.parquet], [/testdata/tpch/plan_sf0.02/customer/3.parquet, /testdata/tpch/plan_sf0.02/customer/4.parquet, /testdata/tpch/plan_sf0.02/customer/5.parquet], [/testdata/tpch/plan_sf0.02/customer/6.parquet, /testdata/tpch/plan_sf0.02/customer/7.parquet, /testdata/tpch/plan_sf0.02/customer/8.parquet], ...]}, projection=[c_phone, c_acctbal], file_type=parquet, predicate=c_acctbal@5 > Some(0),15,2 AND substr(c_phone@4, 1, 2) IN (SET) ([13, 31, 23, 29, 30, 18, 17]), pruning_predicate=c_acctbal_null_count@1 != row_count@2 AND c_acctbal_max@0 > Some(0),15,2, required_guarantees=[] └────────────────────────────────────────────────── - ┌───── Stage 2 ── Tasks: t0:[p0..p3] t1:[p4..p7] t2:[p8..p11] t3:[p12..p15] + ┌───── Stage 2 ── Tasks: t0:[p0..p5] t1:[p6..p11] t2:[p12..p17] t3:[p18..p23] │ FilterExec: substr(c_phone@1, 1, 2) IN (SET) ([13, 31, 23, 29, 30, 18, 17]) - │ PartitionIsolatorExec: tasks=4 partitions=16 - │ DataSourceExec: file_groups={16 groups: [[/testdata/tpch/plan_sf0.02/customer/1.parquet], [/testdata/tpch/plan_sf0.02/customer/10.parquet], [/testdata/tpch/plan_sf0.02/customer/11.parquet], [/testdata/tpch/plan_sf0.02/customer/12.parquet], [/testdata/tpch/plan_sf0.02/customer/13.parquet], ...]}, projection=[c_custkey, c_phone, c_acctbal], file_type=parquet, predicate=substr(c_phone@4, 1, 2) IN (SET) ([13, 31, 23, 29, 30, 18, 17]) + │ DistributedLeafExec: DataSourceExec: file_groups={6 groups: [[/testdata/tpch/plan_sf0.02/customer/1.parquet, /testdata/tpch/plan_sf0.02/customer/10.parquet, /testdata/tpch/plan_sf0.02/customer/11.parquet], [/testdata/tpch/plan_sf0.02/customer/12.parquet, /testdata/tpch/plan_sf0.02/customer/13.parquet, /testdata/tpch/plan_sf0.02/customer/14.parquet], [/testdata/tpch/plan_sf0.02/customer/15.parquet, /testdata/tpch/plan_sf0.02/customer/16.parquet, /testdata/tpch/plan_sf0.02/customer/2.parquet], [/testdata/tpch/plan_sf0.02/customer/3.parquet, /testdata/tpch/plan_sf0.02/customer/4.parquet, /testdata/tpch/plan_sf0.02/customer/5.parquet], [/testdata/tpch/plan_sf0.02/customer/6.parquet, /testdata/tpch/plan_sf0.02/customer/7.parquet, /testdata/tpch/plan_sf0.02/customer/8.parquet], ...]}, projection=[c_custkey, c_phone, c_acctbal], file_type=parquet, predicate=substr(c_phone@4, 1, 2) IN (SET) ([13, 31, 23, 29, 30, 18, 17]) └────────────────────────────────────────────────── "); Ok(()) diff --git a/tests/udfs.rs b/tests/udfs.rs index 8e8114f5..505859e7 100644 --- a/tests/udfs.rs +++ b/tests/udfs.rs @@ -50,10 +50,9 @@ mod tests { │ [Stage 1] => NetworkShuffleExec: output_partitions=3, input_tasks=3 └────────────────────────────────────────────────── ┌───── Stage 1 ── Tasks: t0:[p0..p5] t1:[p0..p5] t2:[p0..p5] - │ RepartitionExec: partitioning=Hash([test_udf(weather.RainToday)@0], 6), input_partitions=1 + │ RepartitionExec: partitioning=Hash([test_udf(weather.RainToday)@0], 6), input_partitions=3 │ AggregateExec: mode=Partial, gby=[test_udf(RainToday@0) as test_udf(weather.RainToday)], aggr=[count(Int64(1))] - │ PartitionIsolatorExec: tasks=3 partitions=3 - │ DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[RainToday], file_type=parquet + │ DistributedLeafExec: DataSourceExec: file_groups={3 groups: [[/testdata/weather/result-000000.parquet], [/testdata/weather/result-000001.parquet], [/testdata/weather/result-000002.parquet]]}, projection=[RainToday], file_type=parquet └────────────────────────────────────────────────── ", );