fix: embed iceberg.schema in Arrow schema metadata for Snowflake compatibility #2250

Open
vovacf201 wants to merge 5 commits into apache:main from risingwavelabs:pr/iceberg-schema-arrow-metadata

@vovacf201

The ParquetWriter was only writing the iceberg.schema JSON into the Parquet footer key-value metadata (via WriterProperties). Downstream readers like Snowflake also expect it in the Arrow schema metadata map, which is serialized as an IPC-encoded schema under the ARROW:schema key of the Parquet file.

Inject the iceberg.schema JSON into the Arrow schema metadata during writer initialization so it is present in both locations, matching the behavior of the Java Iceberg implementation.
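
To illustrate the intent, here is a minimal std-only sketch of the injection step. It is not the code in this PR: `ArrowSchemaSketch`, `with_iceberg_schema`, and `footer_key_values` are hypothetical stand-ins for the real Arrow schema and writer-properties types, and the only detail taken from the description above is the `iceberg.schema` key being written to both locations.

```rust
use std::collections::HashMap;

/// Metadata key named in the PR description.
const ICEBERG_SCHEMA_KEY: &str = "iceberg.schema";

/// Hypothetical stand-in for an Arrow schema; only the metadata map matters here.
#[derive(Debug, Clone, Default, PartialEq)]
struct ArrowSchemaSketch {
    field_names: Vec<String>,
    metadata: HashMap<String, String>,
}

/// Returns a copy of `schema` whose metadata also carries the Iceberg schema JSON.
/// Existing metadata entries are preserved; an existing `iceberg.schema` entry
/// is overwritten.
fn with_iceberg_schema(schema: &ArrowSchemaSketch, iceberg_schema_json: &str) -> ArrowSchemaSketch {
    let mut metadata = schema.metadata.clone();
    metadata.insert(
        ICEBERG_SCHEMA_KEY.to_string(),
        iceberg_schema_json.to_string(),
    );
    ArrowSchemaSketch {
        field_names: schema.field_names.clone(),
        metadata,
    }
}

/// Hypothetical stand-in for the footer key-value metadata, which was already
/// being written before this change.
fn footer_key_values(iceberg_schema_json: &str) -> Vec<(String, String)> {
    vec![(
        ICEBERG_SCHEMA_KEY.to_string(),
        iceberg_schema_json.to_string(),
    )]
}
```

In this sketch a writer would call both helpers once during initialization with the same JSON string, so a reader finds the same value whether it looks at the footer entries or at the Arrow schema metadata.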

Depends on #2249

Cherry-picked from risingwavelabs/iceberg-rust commit bcf208d

jonathanc-n and others added 5 commits March 18, 2026 11:07
* mock change

* remove fmt

* add unit tests

* fix tests

* format

* commit
#134)

* fix: embed iceberg.schema in Arrow schema metadata for Snowflake compatibility

The ParquetWriter was only writing the iceberg.schema JSON into the
Parquet footer key-value metadata (WriterProperties). Downstream readers
like Snowflake also expect it in the Arrow schema metadata map, which is
encoded in the ARROW:schema IPC section of the Parquet file.

Inject the iceberg.schema JSON into the Arrow schema metadata during
writer initialization so it is present in both locations, matching the
behavior of the Java Iceberg implementation.

* feat: fixed formatting