Feature Request / Improvement
Iceberg can write shredded variant columns to Parquet (#14297). On the read path,
SparkScanBuilder does not implement Spark 4.1's SupportsPushDownVariantExtractions,
so Spark never rewrites variant_get(...) into struct field accesses and never prunes the
scan output schema to the requested shredded fields.
As a result, even when queries only need one or two paths (e.g.
variant_get(payload, '$.size', 'long')), the optimized plan still evaluates
variant_get in a Spark Filter above BatchScan, and the scan output still
treats payload as VariantType. The reader therefore loads the full shredded
Parquet layout (hundreds of typed_value.* columns in practice), materializes a
full VARIANT per row, and applies path extraction only after the scan. That reconstruction
is expensive: on the GitHub Activities 1-day shredded Iceberg table (GHA), a simple
filter/count query was ~14× slower than the same workload on a plain JSON string column
(~63s vs ~4.4s), despite only ~1.6× more storage.
(Predicate pushdown of variant_get into IcebergScan filters for manifest
planning is handled separately in #15385; it does not rewrite the plan or avoid reading
all shredded columns per file.)
This issue implements the DSv2 contract (plan rewrite). A follow-on change will wire the
annotated readSchema() into the Parquet reader and avoid full-variant reconstruction (I/O reduction).
Plan rewrite example
For this query:
CREATE TABLE events (
  id INT,
  type STRING,
  payload VARIANT
) USING iceberg
TBLPROPERTIES ('format-version' = '3');

SELECT count(*) AS large_events
FROM events
WHERE type = 'PushEvent'
  AND variant_get(payload, '$.size', 'long') > 5;
Before (today — no SupportsPushDownVariantExtractions)
Aggregate [count(1)]
+- Filter (... type = PushEvent)
          AND (variant_get(payload#22, $.size, LongType, ...) > 5))   ← still a function call
   +- RelationV2[type#19, payload#22]

Filter (variant_get(payload#22, $.size, ...) > 5)   ← runs per row AFTER scan
+- BatchScan [type#15, payload#18]
     IcebergScan(..., filters=type IS NOT NULL, payload IS NOT NULL, type = 'PushEvent')
     ReadSchema: full payload variant / all shredded columns
After (DSv2 plan rewrite)
Aggregate [count(1)]
+- Filter (... type = PushEvent)
          AND (payload#25.0 > 5))   ← struct field access, not variant_get
   +- RelationV2[type#24, payload#25]

Filter (isnotnull(payload#25) AND (payload#25.0 > 5))
+- BatchScan [type#24, payload#25]
     readSchema(): struct<payload: struct<0: bigint, ...>>
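
To make the shape of the rewrite concrete, here is a toy Python model of how requested extractions could map to ordinal struct fields. It is only a sketch: the `Extraction` class and `plan_rewrite` function are hypothetical names, not Spark or Iceberg APIs. Naming fields by ordinal is inferred from the example plan above (`payload#25.0`, `struct<0: bigint, ...>`); deduplicating repeated extractions and numbering them in first-seen order are assumptions.

```python
from dataclasses import dataclass


# Hypothetical model of one variant_get call; not a Spark or Iceberg class.
@dataclass(frozen=True)
class Extraction:
    column: str    # variant column name, e.g. "payload"
    path: str      # JSON path, e.g. "$.size"
    sql_type: str  # requested type as it appears in a schema string, e.g. "bigint"


def plan_rewrite(extractions):
    """Assign each distinct extraction an ordinal within its column.

    Returns (ordinals, read_schema): ordinals maps every Extraction to the
    struct field ordinal that would replace its variant_get call, and
    read_schema is the pruned schema string the scan would report.
    Assumption: ordinals follow first-seen order and duplicates are merged.
    """
    per_column = {}  # column -> distinct extractions, in first-seen order
    ordinals = {}
    for ex in extractions:
        fields = per_column.setdefault(ex.column, [])
        if ex not in fields:
            fields.append(ex)
        ordinals[ex] = fields.index(ex)

    columns = []
    for column, fields in per_column.items():
        inner = ", ".join(f"{i}: {f.sql_type}" for i, f in enumerate(fields))
        columns.append(f"{column}: struct<{inner}>")
    return ordinals, "struct<" + ", ".join(columns) + ">"


size = Extraction("payload", "$.size", "bigint")
ordinals, read_schema = plan_rewrite([size])
print(f"payload.{ordinals[size]}")  # field access standing in for variant_get
print(read_schema)
```

For the single `$.size` extraction in the example query, the model yields ordinal 0 and a `payload` struct with one `bigint` field, matching the field access and schema in the "After" plan.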
Note
This issue changes the logical/physical plan shape and readSchema() contract. Parquet may still read all shredded columns until the follow-on wires readSchema() into the reader.
Query engine
Spark
Willingness to contribute