fix: normalize nested field names in RecordBatchTransformer by vovacf201 · Pull Request #2251 · apache/iceberg-rust

vovacf201 · 2026-03-18T11:11:14Z

Parquet files use "item" as the List inner field name (Parquet spec) while Iceberg uses "element" (Iceberg spec). Similarly, Parquet uses "entries" for Map inner fields while Iceberg uses "key_value".

The RecordBatchTransformer previously used equals_datatype() (which ignores field names) to decide between PassThrough and Promote. This meant columns with mismatched nested field names were passed through unchanged, causing downstream consumers that use strict schema validation (like DataFusion's concat_batches) to fail with:
"column types must match schema types, expected List(Field { name: element ..."

Fix: use a 3-way comparison in generate_transform_operations:

1. Strict == match -> PassThrough (no cast needed)
2. equals_datatype() but != (field names differ) -> Promote (cast to normalize names)
3. Neither -> Promote (actual type promotion)
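
For reviewers who want the decision logic in isolation, here is a minimal, dependency-free sketch of the 3-way comparison. It is not the code in this PR: `Ty`, `Op`, `equals_ignoring_names`, `plan`, and the `rename_only` flag are hypothetical stand-ins for Arrow's `DataType`, the transformer's operations, `equals_datatype()`, and `generate_transform_operations`.

```rust
// Hypothetical, simplified stand-in for Arrow's DataType.
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Int32,
    Int64,
    // A list whose inner field carries a name, e.g. "item" or "element".
    List(String, Box<Ty>),
}

impl Ty {
    // Structural comparison that ignores nested field names.
    // Assumed here to mirror what the description says equals_datatype() does.
    fn equals_ignoring_names(&self, other: &Ty) -> bool {
        match (self, other) {
            (Ty::List(_, a), Ty::List(_, b)) => a.equals_ignoring_names(b),
            (a, b) => a == b,
        }
    }
}

// Hypothetical operation type; `rename_only` exists only to make
// cases 2 and 3 distinguishable in this illustration.
#[derive(Debug, PartialEq)]
enum Op {
    PassThrough,
    Promote { target: Ty, rename_only: bool },
}

fn plan(source: &Ty, target: &Ty) -> Op {
    if source == target {
        // 1. Strict match, names included: no cast needed.
        Op::PassThrough
    } else if source.equals_ignoring_names(target) {
        // 2. Same shape, different nested field names: cast to normalize names.
        Op::Promote { target: target.clone(), rename_only: true }
    } else {
        // 3. Different types: actual type promotion.
        Op::Promote { target: target.clone(), rename_only: false }
    }
}
```

With a source of `List("item", Int32)` and a target of `List("element", Int32)`, a check based only on `equals_ignoring_names` would treat the column as unchanged, while `plan` routes it through a cast so the output batch carries the target's field names.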

Cherry-picked from risingwavelabs/iceberg-rust commit 2e56dde

* fix: normalize nested field names in RecordBatchTransformer
* style: apply nightly rustfmt formatting

vovacf201 wants to merge 1 commit into apache:main from risingwavelabs:pr/normalize-nested-field-names
