Gate new ScalarSubqueryExec node behind session property#22530
Gate new ScalarSubqueryExec node behind session property#22530LiaCastaneda wants to merge 4 commits into
Conversation
1c7af79 to
4f23ed0
Compare
4f23ed0 to
8416100
Compare
6e88a1c to
ddc20cd
Compare
gabotechs
left a comment
There was a problem hiding this comment.
Nice, thanks @LiaCastaneda. Is there a chance you could take a look at this one @neilconway?
| ## `ScalarSubqueryToJoin` instead of executed by `ScalarSubqueryExec`. This | ||
| ## restores pre-PR-21240 behavior, which has three known shortcomings the | ||
| ## physical-execution path was built to fix: multi-row subqueries silently | ||
| ## return wrong results, and uncorrelated scalar subqueries do not work in | ||
| ## ORDER BY / JOIN ON / aggregate-function arguments. Those cases are | ||
| ## intentionally not covered here; the queries below are the ones where both | ||
| ## paths agree. |
There was a problem hiding this comment.
These were known limitations/bugs #21240 fixed
|
I also ran the tpch queries in my local with the flag turned off (old path), all results match. Maybe it's worth adding it as part of the regular checks edit: added here d1b9dad |
neilconway
left a comment
There was a problem hiding this comment.
I think this makes sense as an interim measure if it will be too difficult to adapt df-distributed and/or ballista in the short-term, but long-term I'd prefer not to have a config option that silently produces incorrect query results. Can we add a note that disabling this is not recommended, and that we plan to remove the config option in the future -- say in a few DF releases from now?
| ## When `datafusion.optimizer.physical_uncorrelated_scalar_subquery` is false, | ||
| ## uncorrelated scalar subqueries are rewritten to left joins by | ||
| ## `ScalarSubqueryToJoin` instead of executed by `ScalarSubqueryExec`. This | ||
| ## restores pre-PR-21240 behavior, which has three known shortcomings the |
There was a problem hiding this comment.
I think you only list two known shortcomings?
| /// execution. When set to false, all scalar subqueries (including | ||
| /// uncorrelated ones) are rewritten to left joins by the | ||
| /// `ScalarSubqueryToJoin` optimizer rule. | ||
| pub physical_uncorrelated_scalar_subquery: bool, default = true |
There was a problem hiding this comment.
The other similar config options use the phrasing enable_...; we should probably adopt that for consistency.
physical_uncorrelated_scalar_subquery is also a mouthful, although I can't immediately think of a more concise name that is also accurate.
| /// | ||
| /// # Arguments | ||
| /// | ||
| /// * `subquery` - The correlated scalar subquery to decorrelate. |
There was a problem hiding this comment.
This comment needs updating.
Which issue does this PR close?
Related to discussion on #21240 and #21080 (comment).
PR #21240 introduced
ScalarSubqueryExec/ScalarSubqueryExprto execute uncorrelated scalar subqueries during physical execution. The two communicate via shared in process state (aslotinExecutionProps), which breaks distributed execution that may split execution across a network boundary between the producer (ScalarSubqueryExec) and the consumer expression (ScalarSubqueryExpr). See more details on this explanation in datafusion-contrib/datafusion-distributed#460What changes are included in this PR?
Adds a new optimizer config option
datafusion.optimizer.physical_uncorrelated_scalar_subquery(default true, preserving the current behavior). When true (default), behavior is unchanged from current main; when false, all scalar subqueries are rewritten to left joins byScalarSubqueryToJoinandScalarSubqueryExecis never constructed (which was the previous behavior).Are these changes tested?
Yes all tests pass and added
uncorrelated_scalar_subquery_rewritten_when_flag_offto test the negative case.Are there any user-facing changes?
Yes, a new config option
datafusion.optimizer.physical_uncorrelated_scalar_subquery(this just changes the way the query is executed but not the results)