What is the bug?
When using eval to materialize multiple dotted paths from the same MAP column in a single command (e.g., eval doc.user.name=doc.user.name, doc.user.age=doc.user.age), the second assignment fails with Field [doc.user.age] not found. The first assignment succeeds, but it removes the MAP root column doc from the schema, making subsequent paths unresolvable.
How can one reproduce the bug?
Steps to reproduce the behavior:
- Create a test index with a JSON string field:
curl -s -X PUT 'localhost:9200/t/_doc/1?refresh=true' -H 'Content-Type: application/json' \
-d '{"doc": "{\"user\":{\"name\":\"John\",\"age\":30}}"}'
curl -s -X PUT 'localhost:9200/t/_doc/2?refresh=true' -H 'Content-Type: application/json' \
-d '{"doc": "{\"user\":{\"name\":\"Alice\",\"age\":25}}"}'
- Enable Calcite and run:
source=t | spath input=doc | eval doc.user.name=doc.user.name, doc.user.age=doc.user.age
Error: Field [doc.user.age] not found
- Note: splitting into two separate eval commands also fails for the same reason:
source=t | spath input=doc | eval doc.user.name=doc.user.name | eval doc.user.age=doc.user.age
What is the expected behavior?
Both dotted-path assignments should succeed. The MAP root column doc should not be removed when a dotted-path column like doc.user.name is added — the dotted name is a new flat column, not a nested sub-field of doc.
What is your host/environment?
- Version: PPL Calcite V3
- Plugins: 3.6
Do you have any screenshots?
N/A
Do you have any additional context?
- Single dotted-path eval works: eval doc.user.name=doc.user.name succeeds.
- The workaround is to use non-dotted aliases: eval username=doc.user.name, age=doc.user.age.
- The shouldOverrideField heuristic was designed for OpenSearch nested/struct fields where resource.attributes.key is a real sub-field of resource. For MAP columns produced by spath, the dotted name is just a flat column name that happens to contain dots; it is not a nested sub-field.
- The root cause is in projectPlusOverriding → shouldOverrideField, which checks newName.startsWith(originalName + "."). When eval creates a column named doc.user.name, the check "doc.user.name".startsWith("doc.") returns true, so projectPlusOverriding removes the original doc MAP column. The second eval expression doc.user.age=doc.user.age then fails because doc no longer exists in the schema.
// shouldOverrideField (CalciteRelNodeVisitor.java)
private boolean shouldOverrideField(String originalName, List<String> newNames) {
  return newNames.stream()
      .anyMatch(newName ->
          newName.equals(originalName)
              || newName.startsWith(originalName + ".")); // ← incorrectly matches MAP root
}
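The effect of the prefix check can be reproduced outside the plugin with a small standalone sketch. The shouldOverrideField body below is copied from the snippet above. The projectPlusOverriding method here is a hypothetical, heavily simplified stand-in that works on plain field-name lists; it is not the real implementation in CalciteRelNodeVisitor, and the class name OverrideHeuristicDemo is invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class OverrideHeuristicDemo {

    // Same predicate as quoted in this report.
    static boolean shouldOverrideField(String originalName, List<String> newNames) {
        return newNames.stream()
            .anyMatch(newName ->
                newName.equals(originalName)
                    || newName.startsWith(originalName + "."));
    }

    // ASSUMPTION: simplified stand-in for projectPlusOverriding. It keeps every
    // existing field that is not "overridden", then appends the new fields.
    static List<String> projectPlusOverriding(List<String> schema, List<String> newNames) {
        List<String> result = new ArrayList<>();
        for (String field : schema) {
            if (!shouldOverrideField(field, newNames)) {
                result.add(field);
            }
        }
        result.addAll(newNames);
        return result;
    }

    public static void main(String[] args) {
        // Schema after spath input=doc: only the MAP root column "doc".
        List<String> schema = List.of("doc");

        // First eval assignment adds the flat column "doc.user.name".
        List<String> afterFirst = projectPlusOverriding(schema, List.of("doc.user.name"));

        System.out.println(afterFirst);                 // [doc.user.name]
        System.out.println(afterFirst.contains("doc")); // false: the MAP root was dropped
    }
}
```

Under these assumptions, the schema after the first assignment no longer contains doc, which matches the reported Field [doc.user.age] not found error when the second assignment tries to resolve doc.user.age.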