[CALCITE-7510] EnumerableTableModify UPDATE, DELETE, INSERT support and bug fixes#4922
[CALCITE-7510] EnumerableTableModify UPDATE, DELETE, INSERT support and bug fixes#4922wasabii wants to merge 1 commit into
Conversation
|
Thank you for your contribution! Please refer to point four of the PR template and make adjustments accordingly. |
0f0eb88 to
09685a2
Compare
|
|
|
||
| /** Generates code for an UPDATE statement. | ||
| * | ||
| * <p>The child produces, for each row matched by the WHERE clause, a row of |
There was a problem hiding this comment.
is "the child" childExp?
This phrase is awkward with lots of commas, you can rearrange it a bit.
| * updateColumnList.size()} fields are the new values, one per column named in | ||
| * the SET clause. Filtering by WHERE has already been applied upstream, so | ||
| * every source row corresponds to an existing row in the modifiable collection | ||
| * and can be located by full-row content equality. |
There was a problem hiding this comment.
What happens if the collection is a multiset?
Equality will match multiple rows.
I hope there is a test for this case.
There was a problem hiding this comment.
It would match multiple rows. I thought about this, but I am not sure how better to go about it currently. There isn't really any sort of 'row id' passed around between the two sides of this. I also don't really like the full equality test. But, no PK metadata either.
I figured that this was largely good enough for the ServerDdlExecutor.
There was a problem hiding this comment.
Actually, I could do something like take the input update list, and ensure that we only ever update one item in the target collection per item in the update list. It's just a lot of complicated accounting, and I don't know if it matter. It certainly doesn't for my use case for this (which is just minimal UPDATE functionality so I can use ServerDdlExecutor as the backend for some other projects test cases.
There was a problem hiding this comment.
Calcite is a library with many users, we have to build APIs that have well-defined contracts.
The contract has to be defined clearly, and to make sense. Perhaps it's fine to not support deletion from multi-sets, but this has to be documented and perhaps justified. One justification is that using the DELETE statement you can only delete all or none of the copies of a value from a multiset.
There was a problem hiding this comment.
I implemented it as I suggested. With tests. Since all the records in this hypothetical are the same, it just deletes the first record it hits for each item in the source set. So whatever copies remain remain.
|
|
||
| /** Generates code for a DELETE statement. | ||
| * | ||
| * <p>{@code Object[].equals()} is reference equality, so |
There was a problem hiding this comment.
This comment is about the implementation and not what this function does. I would make it a regular comment. The JavaDoc can use more information, though.
|
|
||
| /** Generates code for an INSERT statement. | ||
| * | ||
| * <p>When the child row type differs from the table row type (the common case |
| /** | ||
| * Removes from {@code sink} every row that appears in {@code source}. | ||
| * | ||
| * <p>For multi-column rows the backing collection stores {@code Object[]} |
There was a problem hiding this comment.
another implementation comment
| for (Object row : tempList) { | ||
| toRemove.add(Arrays.asList((Object[]) row)); | ||
| } | ||
| ((Collection<Object[]>) sink).removeIf(row -> toRemove.contains(Arrays.asList(row))); |
There was a problem hiding this comment.
This seems to rely on the fact that all equal rows will either be deleted or not.
Maybe this is fine, but then function definition should specify this.
There are legit cases where you may want to remove only some of the copies (e.g., IVM).
mihaibudiu
left a comment
There was a problem hiding this comment.
Please do not amend commits during the review process, especially for such a large PR.
It makes it difficult for reviewers to see what has changed
mihaibudiu
left a comment
There was a problem hiding this comment.
I think this is fine, but we won't merge it before the next release, which will hopefully happen soon.
| } | ||
|
|
||
| /** | ||
| * Removes from {@code sink} one occurrence per occurrence in {@code source} |
There was a problem hiding this comment.
this comment is a bit awkward to read
| sourceRows.add(e.current()); | ||
| } | ||
| } | ||
| // Drain source first, then mutate sink, to avoid iterator interference |
There was a problem hiding this comment.
I would say "mutate sink after having drained the source".
The way the comment is written it makes it look like the next loop also mutates the source
I did this because the policy said The commit text should match the PR title. I assumed you wanted then squashed. It was super annoying to do and I would have rather not but I thought it was correct. |
|
We'll squash before merging |



Jira Link
CALCITE-7510
Changes Proposed
Added UPDATE support to EnumerableTableModify. Generates code that calls ExtendedEnumerable.update, which is new, and scans the destination list to apply updates.