Skip to content

Change the final output result of struct from list to map#5227

Merged
yuancu merged 2 commits intoopensearch-project:mainfrom
qianheng-aws:graphEnhance
Mar 13, 2026
Merged

Change the final output result of struct from list to map#5227
yuancu merged 2 commits intoopensearch-project:mainfrom
qianheng-aws:graphEnhance

Conversation

@qianheng-aws
Copy link
Collaborator

@qianheng-aws qianheng-aws commented Mar 12, 2026

Description

This PR changes the final output format of RelRecordType from a list to a map (keyed by column name) to prevent the loss of column name information in query results.

In PPL, this RelDataType is only used by graphlookup operator.

Before this PR, graphlookup like:

source=employees 
| graphLookup employees 
    start=reportsTo
    edge=reportsTo-->name
    maxDepth=3 
  as hierarchy

will get response:

{
  "schema": [
    {"name": "id", "alias": null, "type": "integer"},
    {"name": "name", "alias": null, "type": "keyword"},
    {"name": "reportsTo", "alias": null, "type": "keyword"},
    {"name": "hierarchy", "alias": null, "type": "array"}
  ],
  "datarows": [
    [1, "Dev", "Eliot", [
      ["Eliot", "Ron", 2],
      ["Ron", "Andrew", 3],
      ["Andrew", null, 4]
    ]]
  ],
  "total": 1,
  "size": 1,
  "status": 200
}

With this PR, it gets changed to:

{
  "schema": [
    {"name": "id", "alias": null, "type": "integer"},
    {"name": "name", "alias": null, "type": "keyword"},
    {"name": "reportsTo", "alias": null, "type": "keyword"},
    {"name": "hierarchy", "alias": null, "type": "array"}
  ],
  "datarows": [
    [1, "Dev", "Eliot", [
      {"name": "Eliot", "reportsTo": "Ron", "id": 2},
      {"name": "Ron", "reportsTo": "Andrew", "id": 3},
      {"name": "Andrew", "reportsTo": null, "id": 4}
    ]]
  ],
  "total": 1,
  "size": 1,
  "status": 200
}

Related Issues

Resolves #5225

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Heng Qian <qianheng@amazon.com>
@qianheng-aws qianheng-aws added enhancement New feature or request feature labels Mar 12, 2026
@github-actions
Copy link
Contributor

github-actions bot commented Mar 12, 2026

PR Reviewer Guide 🔍

(Review updated until commit 48642c0)

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 Multiple PR themes

Sub-PR theme: Core struct-to-map conversion in execution engine

Relevant files:

  • opensearch/src/main/java/org/opensearch/sql/opensearch/executor/OpenSearchExecutionEngine.java

Sub-PR theme: Update integration tests and documentation for map output format

Relevant files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLGraphLookupIT.java
  • docs/user/ppl/cmd/graphlookup.md

⚡ Recommended focus areas for review

Fallback Behavior

When a StructImpl is encountered but no type metadata is available (type is null or not ROW), the code falls back to Arrays.asList(attrs) — the old list-based behavior. This inconsistency could silently produce the old format in some code paths. It should be verified whether this fallback is intentional and documented.

if (value instanceof StructImpl structImpl) {
  Object[] attrs = structImpl.getAttributes();
  if (type != null && type.getSqlTypeName() == SqlTypeName.ROW) {
    List<RelDataTypeField> fields = type.getFieldList();
    Map<String, Object> map = new LinkedHashMap<>();
    for (int i = 0; i < fields.size() && i < attrs.length; i++) {
      map.put(fields.get(i).getName(), processValue(attrs[i], fields.get(i).getType()));
    }
    return map;
  }
  return Arrays.asList(attrs);
Null Type Propagation

When processing a Map (not a StructImpl), nested values are processed with processValue(entry.getValue(), null), discarding any type information. If a map contains nested structs, they will fall back to the list representation. It should be verified whether maps can contain nested structs that need type-aware processing.

if (value instanceof Map) {
  Map<String, Object> map = (Map<String, Object>) value;
  Map<String, Object> convertedMap = new HashMap<>();
  for (Map.Entry<String, Object> entry : map.entrySet()) {
    convertedMap.put(entry.getKey(), processValue(entry.getValue(), null));
  }
  return convertedMap;
Map Ordering

Tests use Map.of() (unordered) for some assertions and mapOf() (LinkedHashMap, ordered) for others. If the test framework compares maps by equality this is fine, but if it compares by serialized string representation, Map.of() entries could appear in arbitrary order and cause flaky tests.

rows("Dev", "Eliot", 1, List.of(Map.of("name", "Eliot", "reportsTo", "Ron", "id", 2))),
rows("Eliot", "Ron", 2, List.of(Map.of("name", "Ron", "reportsTo", "Andrew", "id", 3))),
rows("Ron", "Andrew", 3, List.of(mapOf("name", "Andrew", "reportsTo", null, "id", 4))),
rows("Andrew", null, 4, Collections.emptyList()),
rows("Asya", "Ron", 5, List.of(Map.of("name", "Ron", "reportsTo", "Andrew", "id", 3))),
rows("Dan", "Andrew", 6, List.of(mapOf("name", "Andrew", "reportsTo", null, "id", 4))));

@github-actions
Copy link
Contributor

github-actions bot commented Mar 12, 2026

PR Code Suggestions ✨

Latest suggestions up to 48642c0
Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
General
Propagate type metadata when processing Map values

When processing a Map value, the type metadata is passed as null for all nested
values, losing field-type information. If the incoming type is a ROW type, you
should look up the field type by name and pass it along, similar to how StructImpl
is handled.

opensearch/src/main/java/org/opensearch/sql/opensearch/executor/OpenSearchExecutionEngine.java [247-254]

 if (value instanceof Map) {
   Map<String, Object> map = (Map<String, Object>) value;
   Map<String, Object> convertedMap = new HashMap<>();
   for (Map.Entry<String, Object> entry : map.entrySet()) {
-    convertedMap.put(entry.getKey(), processValue(entry.getValue(), null));
+    RelDataType fieldType = null;
+    if (type != null && type.getSqlTypeName() == SqlTypeName.ROW) {
+      RelDataTypeField field = type.getField(entry.getKey(), true, false);
+      if (field != null) fieldType = field.getType();
+    }
+    convertedMap.put(entry.getKey(), processValue(entry.getValue(), fieldType));
   }
   return convertedMap;
 }
Suggestion importance[1-10]: 5

__

Why: The suggestion is valid and would improve type propagation consistency between Map and StructImpl handling. However, in practice, Map values from JDBC result sets may not always correspond to ROW types, and the current code already handles the primary use case. This is a moderate improvement for correctness in edge cases.

Low
Add input validation to test helper method

The mapOf helper does not validate that an even number of arguments is provided. If
an odd number of arguments is accidentally passed, keysAndValues[i + 1] will throw
an ArrayIndexOutOfBoundsException. Add a guard at the start of the method to catch
this misuse early with a clear error message.

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLGraphLookupIT.java [55-61]

 private static Map<String, Object> mapOf(Object... keysAndValues) {
+  if (keysAndValues.length % 2 != 0) {
+    throw new IllegalArgumentException("mapOf requires an even number of arguments, got: " + keysAndValues.length);
+  }
   LinkedHashMap<String, Object> map = new LinkedHashMap<>();
   for (int i = 0; i < keysAndValues.length; i += 2) {
     map.put((String) keysAndValues[i], keysAndValues[i + 1]);
   }
   return map;
 }
Suggestion importance[1-10]: 3

__

Why: Adding a guard for odd-length arguments to mapOf is a defensive programming improvement, but since this is a private test helper used only within the test class with hardcoded arguments, the risk of misuse is very low and the impact is minimal.

Low

Previous suggestions

Suggestions up to commit 7b6d33c
CategorySuggestion                                                                                                                                    Impact
General
Recursively process struct attributes in fallback path

When type is null or not a ROW type, the fallback Arrays.asList(attrs) returns raw
attribute values without recursive processing. Nested StructImpl or List objects
within attrs won't be converted. The fallback should also recursively process each
attribute.

opensearch/src/main/java/org/opensearch/sql/opensearch/executor/OpenSearchExecutionEngine.java [255-266]

 if (value instanceof StructImpl structImpl) {
   Object[] attrs = structImpl.getAttributes();
   if (type != null && type.getSqlTypeName() == SqlTypeName.ROW) {
     List<RelDataTypeField> fields = type.getFieldList();
     Map<String, Object> map = new LinkedHashMap<>();
     for (int i = 0; i < fields.size() && i < attrs.length; i++) {
       map.put(fields.get(i).getName(), processValue(attrs[i], fields.get(i).getType()));
     }
     return map;
   }
-  return Arrays.asList(attrs);
+  List<Object> result = new ArrayList<>();
+  for (Object attr : attrs) {
+    result.add(processValue(attr, null));
+  }
+  return result;
 }
Suggestion importance[1-10]: 6

__

Why: The fallback path in StructImpl handling returns raw attrs without recursive processing, which could leave nested StructImpl or List objects unconverted. The suggestion correctly identifies this gap and provides a proper fix by recursively calling processValue on each attribute.

Low
Propagate type info when processing nested maps

When processing a Map value, the type information is discarded by passing null as
the type for nested values. If the map's field types are available from the
RelDataType, they should be passed along to preserve struct-to-map conversion for
nested fields. Consider looking up the field type from type when it's a ROW type,
similar to how StructImpl is handled.

opensearch/src/main/java/org/opensearch/sql/opensearch/executor/OpenSearchExecutionEngine.java [247-254]

 if (value instanceof Map) {
   Map<String, Object> map = (Map<String, Object>) value;
   Map<String, Object> convertedMap = new HashMap<>();
   for (Map.Entry<String, Object> entry : map.entrySet()) {
-    convertedMap.put(entry.getKey(), processValue(entry.getValue(), null));
+    RelDataType fieldType = null;
+    if (type != null && type.getSqlTypeName() == SqlTypeName.ROW) {
+      RelDataTypeField field = type.getField(entry.getKey(), true, false);
+      if (field != null) fieldType = field.getType();
+    }
+    convertedMap.put(entry.getKey(), processValue(entry.getValue(), fieldType));
   }
   return convertedMap;
 }
Suggestion importance[1-10]: 5

__

Why: The suggestion is valid - passing null for nested map values loses type information that could be used for proper struct-to-map conversion. However, in practice, Map values from JDBC result sets typically don't contain nested StructImpl objects that need type-aware conversion, making this a minor improvement rather than a critical fix.

Low

Signed-off-by: Heng Qian <qianheng@amazon.com>
@github-actions
Copy link
Contributor

Persistent review updated to latest commit 48642c0

@yuancu yuancu merged commit df8f259 into opensearch-project:main Mar 13, 2026
58 of 59 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEATURE][RFC] Support native OpenSearch JSON document format as PPL query response

3 participants