SQL: struct support #586

Open
kbatuigas wants to merge 5 commits into rp-sql from DOC-2019-document-feature-record-structure-type-support

@kbatuigas kbatuigas commented May 14, 2026

Description

This pull request introduces comprehensive documentation improvements for working with nested fields in Redpanda SQL, focusing on mapping nested Protobuf or Avro structures as SQL ROW columns and querying them directly. It clarifies the use of the struct_mapping_policy option, expands the reference for the ROW data type, and adds a dedicated how-to guide for querying nested fields.

New documentation and feature explanations:

  • Added a new how-to page, query-nested-fields.adoc, detailing how to map topics with nested schemas as SQL tables using struct_mapping_policy = 'COMPOUND', how to query nested fields with ROW syntax, and how to handle recursive (cyclic) schemas.
  • Updated the navigation (nav.adoc) to include the new "Query Topics with Nested Fields" guide.

Improvements to ROW data type documentation:

  • Expanded the ROW data type reference to document field access (by position and name), wildcard projection, lexicographic comparison, NULL checks, text conversion, and usage in GROUP BY, ORDER BY, and JOIN clauses.
  • Enhanced the ROW type summary to mention its support for field access, comparisons, and use in query clauses.
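For readers unfamiliar with lexicographic comparison, here is a minimal Python sketch of the rule, modeling row values as tuples. It assumes the usual left-to-right, field-by-field ordering and does not model SQL NULL semantics; `row_less_than` is a hypothetical helper, not part of Redpanda SQL.

```python
def row_less_than(left, right):
    """Compare two equal-length rows field by field, left to right.

    The first pair of unequal fields decides the result; later fields
    are only consulted when all earlier fields are equal.
    """
    for a, b in zip(left, right):
        if a != b:
            return a < b
    return False  # every field is equal, so left is not less than right


# The first field decides, even though "b" > "a" in the second field.
first_field_decides = row_less_than((1, "b"), (2, "a"))

# The first fields tie, so the second field decides.
second_field_decides = row_less_than((1, "a"), (1, "b"))

# Python's built-in tuple comparison follows the same rule.
matches_builtin = first_field_decides == ((1, "b") < (2, "a"))
```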

Clarifications to CREATE TABLE options:

  • Clarified the struct_mapping_policy option in the CREATE TABLE documentation, emphasizing that COMPOUND maps nested structures to SQL ROW columns and noting that cyclic types are only supported in JSON mode.
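To illustrate why a cyclic schema cannot be expanded into a fixed nested type, the following Python sketch walks a hypothetical type-reference graph and reports the first cycle it finds. The schema dictionaries and the `find_cycle` helper are assumptions made for illustration; this does not describe how Redpanda SQL detects cycles.

```python
def find_cycle(schema, root):
    """Return the type names forming a cycle reachable from root, or None.

    schema maps each type name to the list of type names it references.
    """
    path = []        # types on the current depth-first path, in order
    on_path = set()  # same types, for fast membership checks
    done = set()     # types already fully explored without finding a cycle

    def visit(name):
        if name in on_path:
            # Reached a type that is already being expanded: a cycle.
            return path[path.index(name):] + [name]
        if name in done:
            return None
        on_path.add(name)
        path.append(name)
        for child in schema.get(name, []):
            cycle = visit(child)
            if cycle:
                return cycle
        path.pop()
        on_path.discard(name)
        done.add(name)
        return None

    return visit(root)


# Hypothetical schemas: the first nests cleanly, the second refers back to itself.
acyclic = {"Order": ["Customer"], "Customer": []}
cyclic = {"Employee": ["Department"], "Department": ["Employee"]}

acyclic_result = find_cycle(acyclic, "Order")
cyclic_result = find_cycle(cyclic, "Employee")
```

Expanding `Employee` in the second schema would never terminate, which is the kind of situation in which a JSON representation is the natural fallback.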

Resolves https://github.com/redpanda-data/documentation-private/issues/
Review deadline: 20 May

Page previews

Reference > Redpanda SQL Reference > Data Types > Row
Redpanda SQL > Query Data > Query Topics with Nested Fields

Checks

  • New feature
  • Content gap
  • Support Follow-up
  • Small fix (typos, links, copyedits, etc)

@kbatuigas kbatuigas requested a review from a team as a code owner May 14, 2026 05:30

netlify Bot commented May 14, 2026

Deploy Preview for rp-cloud ready!

Name Link
🔨 Latest commit 60053ed
🔍 Latest deploy log https://app.netlify.com/projects/rp-cloud/deploys/6a0d0256479bdb000845a914
😎 Deploy Preview https://deploy-preview-586--rp-cloud.netlify.app

coderabbitai Bot commented May 14, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 25007d75-3ce4-4963-bd4e-cd32370fd5b9

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

@kbatuigas kbatuigas requested a review from pkonrad1229 May 14, 2026 18:07
:learning-objective-2: Query nested fields using ROW field-access syntax
:learning-objective-3: Recognize and resolve cyclic-reference errors

When a glossterm:topic[]'s schema includes nested Protobuf or Avro message types, you can map those nested structures as SQL `ROW` columns instead of opaque JSON. This makes nested fields queryable by name, includable in projections, and usable in `WHERE`, `GROUP BY`, and `ORDER BY` clauses, without parsing JSON at query time.

That may be a nitpick, but stating that we are mapping as SQL ROW columns is not entirely true.

In PostgreSQL, a ROW is an anonymous record, in which you cannot explicitly set the sub-field names (they contain some generic f1, f2, f<n>... names that you cannot change).

What we do in the COMPOUND mapping is actually create a user-defined type and set the field names according to the schema.

Not sure if that's something that we want to explicitly state here, or maybe the ROW meaning here is something other than PostgreSQL ROW.
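As an analogy only, the distinction can be sketched in Python: a plain tuple is positional, like an anonymous record, while a `namedtuple` carries field names taken from a schema, like a user-defined type. The field names come from the `customer` example on this page; the sample values are hypothetical.

```python
from collections import namedtuple

# Anonymous record: fields are reachable by position only.
anonymous = ("c-42", "Ada", "EMEA")

# Named type: field names are fixed when the type is defined from the schema.
Customer = namedtuple("Customer", ["customer_id", "name", "region"])
named = Customer("c-42", "Ada", "EMEA")

region_by_position = anonymous[2]
region_by_name = named.region

# The anonymous record has no notion of a field called "region".
anonymous_has_names = hasattr(anonymous, "region")
```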

Contributor Author

Changed to

When a glossterm:topic[]'s schema includes nested Protobuf, Avro, or JSON message types, you can map those nested structures as user-defined types (UDTs) with named fields, queryable using SQL ROW field-access syntax, instead of opaque JSON. This makes nested fields queryable by name, includable in projections, and usable in WHERE, GROUP BY, and ORDER BY clauses, without parsing JSON at query time.

@mattschumpert do you have a preference on whether we explicitly mention user-defined types?

No idea. I defer to @pkonrad1229

}
----

Redpanda SQL maps the table with three columns: `order_id` (text), `customer` (a `ROW` with fields `customer_id`, `name`, and `region`), and `amount` (double precision).

ditto here:

a ROW with fields

We may also say something along the lines of:

a structure/UDT with fields

Comment thread modules/sql/pages/query-data/query-nested-fields.adoc Outdated
(1 row)
----

=== Use implicit tuple syntax

I'm late to the party, as this was not modified in this PR :D I believe it's worth noting that the implicit tuple syntax works only when there are two or more expressions.
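Python has a close parallel that may help explain the rule, assuming Redpanda SQL follows the PostgreSQL behavior described here: parentheses around a single expression only group it, and it takes a second expression (or an explicit constructor) to get a tuple. This is an analogy, not Redpanda SQL syntax.

```python
# Parentheses around one expression are just grouping: this is an int.
single = (1)

# Two or more comma-separated expressions form a tuple implicitly.
pair = (1, 2)

# A one-element tuple needs an explicit construction, much as a
# one-field row would need an explicit constructor keyword.
explicit_single = tuple([1])

single_is_tuple = isinstance(single, tuple)
pair_is_tuple = isinstance(pair, tuple)
```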

Contributor Author

@pkonrad1229 Hm, that may have just surfaced as something related based on the Claude Code research... is it OK to leave on this page? The explanation is currently under the first section: https://deploy-preview-586--rp-cloud.netlify.app/redpanda-cloud/reference/sql/sql-data-types/row/#syntax

Comment thread modules/reference/pages/sql/sql-data-types/row.adoc Outdated
Comment thread modules/sql/pages/query-data/query-nested-fields.adoc Outdated
@kbatuigas kbatuigas force-pushed the DOC-2019-document-feature-record-structure-type-support branch from 27ea3c1 to e1f3231 Compare May 19, 2026 03:29

3 participants