Skip to content

[CALCITE-7510] EnumerableTableModify UPDATE, DELETE, INSERT support and bug fixes#4922

Open
wasabii wants to merge 1 commit into
apache:mainfrom
wasabii:upstream-main
Open

[CALCITE-7510] EnumerableTableModify UPDATE, DELETE, INSERT support and bug fixes#4922
wasabii wants to merge 1 commit into
apache:mainfrom
wasabii:upstream-main

Conversation

@wasabii
Copy link
Copy Markdown

@wasabii wasabii commented May 7, 2026

Jira Link

CALCITE-7510

Changes Proposed

Added UPDATE support to EnumerableTableModify. Generates code that calls ExtendedEnumerable.update, which is new, and scans the destination list to apply updates.

@xiedeyantu
Copy link
Copy Markdown
Member

Thank you for your contribution! Please refer to point four of the PR template and make adjustments accordingly.

4. CRITICAL CONSISTENCY RULE
   The following three items MUST match exactly in wording and meaning:
   (A) The Jira Issue Title
   (B) This Pull Request Title
   (C) Your Git Commit Message

   Format: [CALCITE-XXXX] <Description>
   Example: [CALCITE-0000] Add IF NOT EXISTS clause to CREATE TABLE

   Guidelines for a good Title:
   - Illustrate using SQL keywords (in ALL-CAPS) rather than Java method names.
   - Focus on the specification and user experience, not the implementation.
   - Make it clear whether it is a bug or a feature.
   - Use words like "should" to indicate desired behavior vs. current behavior (e.g., "Validator should not close model file").

@wasabii wasabii changed the title [CALCITE-7510]: Implement UPDATE on EnumerableTableModify [CALCITE-7510] EnumerableTableModify UPDATE support May 9, 2026
@wasabii wasabii force-pushed the upstream-main branch 3 times, most recently from 0f0eb88 to 09685a2 Compare May 9, 2026 15:35
@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud Bot commented May 9, 2026

@wasabii wasabii changed the title [CALCITE-7510] EnumerableTableModify UPDATE support [CALCITE-7510] EnumerableTableModify UPDATE, DELETE, INSERT support and bug fixes May 21, 2026

/** Generates code for an UPDATE statement.
*
* <p>The child produces, for each row matched by the WHERE clause, a row of
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is "the child" childExp?
This phrase is awkward with lots of commas, you can rearrange it a bit.

* updateColumnList.size()} fields are the new values, one per column named in
* the SET clause. Filtering by WHERE has already been applied upstream, so
* every source row corresponds to an existing row in the modifiable collection
* and can be located by full-row content equality.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if the collection is a multiset?
Equality will match multiple rows.
I hope there is a test for this case.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would match multiple rows. I thought about this, but I am not sure how better to go about it currently. There isn't really any sort of 'row id' passed around between the two sides of this. I also don't really like the full equality test. But, no PK metadata either.

I figured that this was largely good enough for the ServerDdlExecutor.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I could do something like take the input update list, and ensure that we only ever update one item in the target collection per item in the update list. It's just a lot of complicated accounting, and I don't know if it matter. It certainly doesn't for my use case for this (which is just minimal UPDATE functionality so I can use ServerDdlExecutor as the backend for some other projects test cases.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calcite is a library with many users, we have to build APIs that have well-defined contracts.
The contract has to be defined clearly, and to make sense. Perhaps it's fine to not support deletion from multi-sets, but this has to be documented and perhaps justified. One justification is that using the DELETE statement you can only delete all or none of the copies of a value from a multiset.

Copy link
Copy Markdown
Author

@wasabii wasabii May 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I implemented it as I suggested. With tests. Since all the records in this hypothetical are the same, it just deletes the first record it hits for each item in the source set. So whatever copies remain remain.


/** Generates code for a DELETE statement.
*
* <p>{@code Object[].equals()} is reference equality, so
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is about the implementation and not what this function does. I would make it a regular comment. The JavaDoc can use more information, though.


/** Generates code for an INSERT statement.
*
* <p>When the child row type differs from the table row type (the common case
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as above

/**
* Removes from {@code sink} every row that appears in {@code source}.
*
* <p>For multi-column rows the backing collection stores {@code Object[]}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another implementation comment

for (Object row : tempList) {
toRemove.add(Arrays.asList((Object[]) row));
}
((Collection<Object[]>) sink).removeIf(row -> toRemove.contains(Arrays.asList(row)));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to rely on the fact that all equal rows will either be deleted or not.
Maybe this is fine, but then function definition should specify this.
There are legit cases where you may want to remove only some of the copies (e.g., IVM).

Copy link
Copy Markdown
Contributor

@mihaibudiu mihaibudiu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do not amend commits during the review process, especially for such a large PR.
It makes it difficult for reviewers to see what has changed

Copy link
Copy Markdown
Contributor

@mihaibudiu mihaibudiu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine, but we won't merge it before the next release, which will hopefully happen soon.

}

/**
* Removes from {@code sink} one occurrence per occurrence in {@code source}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this comment is a bit awkward to read

sourceRows.add(e.current());
}
}
// Drain source first, then mutate sink, to avoid iterator interference
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say "mutate sink after having drained the source".
The way the comment is written it makes it look like the next loop also mutates the source

@mihaibudiu mihaibudiu added the LGTM-will-merge-soon Overall PR looks OK. Only minor things left. label May 26, 2026
@wasabii
Copy link
Copy Markdown
Author

wasabii commented May 26, 2026

Please do not amend commits during the review process, especially for such a large PR.
It makes it difficult for reviewers to see what has changed

I did this because the policy said The commit text should match the PR title. I assumed you wanted then squashed.

It was super annoying to do and I would have rather not but I thought it was correct.

@mihaibudiu
Copy link
Copy Markdown
Contributor

We'll squash before merging

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

LGTM-will-merge-soon Overall PR looks OK. Only minor things left.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants