SQL: struct support #586
:learning-objective-2: Query nested fields using ROW field-access syntax
:learning-objective-3: Recognize and resolve cyclic-reference errors

When a glossterm:topic[]'s schema includes nested Protobuf or Avro message types, you can map those nested structures as SQL `ROW` columns instead of opaque JSON. This makes nested fields queryable by name, includable in projections, and usable in `WHERE`, `GROUP BY`, and `ORDER BY` clauses, without parsing JSON at query time.
That may be a nitpick, but stating that we are mapping as SQL ROW columns is not entirely true.
In PostgreSQL, a ROW is an anonymous record, in which you cannot explicitly set the sub-field names (they contain some generic f1, f2, f<n>... names that you cannot change).
What we do in the COMPOUND mapping is we actually create a User-Defined type, and set the names of the fields according to the schema.
Not sure if that's something that we want to explicitly state here, or maybe the ROW meaning here is something other than PostgreSQL ROW.
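
For readers less familiar with the distinction raised above, here is a minimal sketch in plain PostgreSQL (not Redpanda SQL). The type name `pair_t` is hypothetical and exists only for illustration; this thread does not confirm how the COMPOUND mapping names or exposes the types it creates.

```sql
-- PostgreSQL only: an anonymous ROW has no user-chosen field names.
-- Converting it to JSON exposes the generic names: {"f1":1,"f2":"foo"}
SELECT row_to_json(ROW(1, 'foo'));

-- A user-defined composite type (hypothetical name) declares its field names.
CREATE TYPE pair_t AS (id integer, label text);

-- Once the row is cast to the named type, fields can be read by name.
SELECT (ROW(1, 'foo')::pair_t).label;
```

The parentheses around the row expression are required in PostgreSQL so that the parser does not read `label` as a column of a table.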
Changed to:
When a glossterm:topic[]'s schema includes nested Protobuf, Avro, or JSON message types, you can map those nested structures as user-defined types (UDTs) with named fields, queryable using SQL `ROW` field-access syntax, instead of opaque JSON. This makes nested fields queryable by name, includable in projections, and usable in `WHERE`, `GROUP BY`, and `ORDER BY` clauses, without parsing JSON at query time.
@mattschumpert do you have a preference on whether we explicitly mention user defined types?
No idea. I defer to @pkonrad1229
}
----

Redpanda SQL maps the table with three columns: `order_id` (text), `customer` (a `ROW` with fields `customer_id`, `name`, and `region`), and `amount` (double precision).
Ditto here: "a `ROW` with fields".
We may also say something along the lines of: "a structure/UDT with fields".
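
For illustration, a table shaped like the one in the quoted line could be modelled by hand in plain PostgreSQL as sketched below. This is not Redpanda SQL: the names `customer_t` and `orders` and the sample row are assumptions, and in Redpanda SQL the columns would be derived from the topic schema rather than declared manually.

```sql
-- Hypothetical composite type standing in for the nested customer structure.
CREATE TYPE customer_t AS (customer_id text, name text, region text);

-- Three columns, as in the quoted sentence: text, nested structure, double precision.
CREATE TABLE orders (
    order_id text,
    customer customer_t,
    amount   double precision
);

-- Made-up sample row; the ROW constructor is coerced to customer_t on insert.
INSERT INTO orders VALUES ('o-1', ROW('c-1', 'Ada', 'EU'), 42.5);

-- Nested fields are addressed by name in the projection and in WHERE.
SELECT order_id, (customer).name, amount
FROM orders
WHERE (customer).region = 'EU';
```

Whichever wording the docs settle on ("`ROW`" or "structure/UDT"), the point of the example is that the fields carry the names from the schema.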
(1 row)
----

=== Use implicit tuple syntax
I'm late to the party, as this was not modified in this PR :D I believe it's worth noting that the implicit tuple syntax works only when there are two or more expressions
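
A short sketch of that rule as it behaves in PostgreSQL, where the `ROW` keyword is optional only when the list has more than one expression. Whether Redpanda SQL matches PostgreSQL in every detail here is an assumption.

```sql
-- Two or more expressions: the parenthesized list is an implicit row,
-- so it compares equal to the explicit ROW form (returns true).
SELECT (1, 2) = ROW(1, 2);

-- A single parenthesized expression is just a scalar, not a one-field row
-- (returns true because (1) is the integer 1).
SELECT (1) = 1;

-- A one-field row therefore needs the explicit ROW keyword.
SELECT ROW(1) = ROW(1);
```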
@pkonrad1229 Hm, that may have just surfaced as something related based on the Claude Code research... Is it OK to leave on this page? The explanation is currently under the first section: https://deploy-preview-586--rp-cloud.netlify.app/redpanda-cloud/reference/sql/sql-data-types/row/#syntax
27ea3c1 to e1f3231
Description
This pull request introduces comprehensive documentation improvements for working with nested fields in Redpanda SQL, focusing on mapping nested Protobuf or Avro structures as SQL `ROW` columns and querying them directly. It clarifies the use of the `struct_mapping_policy` option, expands the reference for the `ROW` data type, and adds a dedicated how-to guide for querying nested fields.

New documentation and feature explanations:
- Adds `query-nested-fields.adoc`, detailing how to map topics with nested schemas as SQL tables using `struct_mapping_policy = 'COMPOUND'`, how to query nested fields with `ROW` syntax, and how to handle recursive (cyclic) schemas.
- Updates the navigation (`nav.adoc`) to include the new "Query Topics with Nested Fields" guide.

Improvements to `ROW` data type documentation:
- Expands the `ROW` data type reference to document field access (by position and name), wildcard projection, lexicographic comparison, NULL checks, text conversion, and usage in `GROUP BY`, `ORDER BY`, and `JOIN` clauses.
- Updates the `ROW` type summary to mention its support for field access, comparisons, and use in query clauses.

Clarifications to `CREATE TABLE` options:
- Clarifies the `struct_mapping_policy` option in the `CREATE TABLE` documentation, emphasizing that `COMPOUND` maps nested structures to SQL `ROW` columns and noting that cyclic types are only supported in `JSON` mode.

Resolves https://github.com/redpanda-data/documentation-private/issues/
Review deadline: 20 May
Page previews
Reference > Redpanda SQL Reference > Data Types > Row
Redpanda SQL > Query Data > Query Topics with Nested Fields