feat: introduce Result Service using Lakekeeper as REST catalog for Iceberg - catalog migration #4272
Open
mengw15 wants to merge 2 commits into apache:main from
This was referenced Mar 9, 2026
feat: introduce Result Service using Lakekeeper as REST catalog for Iceberg - bootstrap script
#4273
Open
What changes were proposed in this PR?
This is PR 1 of a decomposed series from #4242, focusing on the core Iceberg catalog migration to support Lakekeeper as a
REST catalog.
Scala changes:
- IcebergUtil.scala: added createRestCatalog() for REST catalog connections with S3FileIO (MinIO), and namespace auto-creation for all catalog types
- IcebergCatalogInstance.scala: updated singleton to support REST catalog type selection
- IcebergTableWriter.scala: updated for REST catalog compatibility
- StorageConfig.scala / EnvironmentalVariable.scala: added REST catalog configuration (URI, warehouse name, region, S3 bucket) and environment variable support
- storage.conf: added REST catalog config section (default remains postgres for backward compatibility)
- build.sbt: added iceberg-aws, AWS SDK dependencies, and Netty version override for Arrow compatibility
- PythonWorkflowWorker.scala / ComputingUnitManagingResource.scala: propagate REST catalog config to Python workers and computing units
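To illustrate the configuration behavior described above (a REST catalog section with environment variable support, and postgres remaining the default), here is a minimal sketch in Python rather than the Scala used by the PR. The environment variable names, default values, and the IcebergCatalogConfig class are all hypothetical; the PR description does not list the real ones.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Hypothetical environment variable names and fallback values,
# chosen for illustration only; the PR does not list the real ones.
_DEFAULTS = {
    "STORAGE_ICEBERG_CATALOG_TYPE": "postgres",  # default stays postgres
    "STORAGE_ICEBERG_REST_URI": "http://localhost:8181/catalog",
    "STORAGE_ICEBERG_REST_WAREHOUSE": "texera",
    "STORAGE_ICEBERG_REST_REGION": "us-west-2",
    "STORAGE_ICEBERG_REST_S3_BUCKET": "texera-iceberg",
}


@dataclass(frozen=True)
class IcebergCatalogConfig:
    catalog_type: str
    rest_uri: str
    rest_warehouse: str
    rest_region: str
    rest_s3_bucket: str

    @property
    def uses_rest_catalog(self) -> bool:
        # Only an explicit "rest" setting switches away from the default.
        return self.catalog_type == "rest"


def load_catalog_config(
    env: Optional[Mapping[str, str]] = None,
) -> IcebergCatalogConfig:
    """Resolve each setting from the environment, else use the default."""
    env = os.environ if env is None else env

    def get(key: str) -> str:
        return env.get(key, _DEFAULTS[key])

    return IcebergCatalogConfig(
        catalog_type=get("STORAGE_ICEBERG_CATALOG_TYPE").lower(),
        rest_uri=get("STORAGE_ICEBERG_REST_URI"),
        rest_warehouse=get("STORAGE_ICEBERG_REST_WAREHOUSE"),
        rest_region=get("STORAGE_ICEBERG_REST_REGION"),
        rest_s3_bucket=get("STORAGE_ICEBERG_REST_S3_BUCKET"),
    )
```

Passing an explicit mapping instead of reading os.environ directly keeps the lookup easy to exercise in isolation; with no overrides set, the resolved catalog type is postgres, which matches the backward-compatible default this PR keeps.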
Python changes:
- iceberg_catalog_instance.py / iceberg_utils.py: added REST catalog support via PyIceberg
- storage_config.py: added REST catalog configuration parsing
- texera_run_python_worker.py: accept REST catalog config from Scala side
- requirements.txt: upgraded PyIceberg (0.8.1 → 0.9.0), added s3fs/aiobotocore for S3 access

Database:
- texera_lakekeeper.sql: schema for Lakekeeper's backing database

Note: This PR keeps postgres as the default catalog type in storage.conf. Switching to REST catalog will be enabled in subsequent deployment PRs.
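For readers unfamiliar with PyIceberg, the following is a rough sketch of how a REST catalog connection backed by S3-compatible storage such as MinIO could be set up. It is not the code from this PR: the function names, the catalog name, and all example values are hypothetical. The property keys ("type", "uri", "warehouse", "s3.endpoint", "s3.region", "s3.access-key-id", "s3.secret-access-key") and load_catalog are taken from PyIceberg's documented configuration.

```python
from typing import Dict


def build_rest_catalog_properties(
    uri: str,
    warehouse: str,
    region: str,
    s3_endpoint: str,
    s3_access_key: str,
    s3_secret_key: str,
) -> Dict[str, str]:
    """Assemble PyIceberg properties for a REST catalog with S3 file IO."""
    return {
        "type": "rest",          # select PyIceberg's REST catalog
        "uri": uri,              # REST catalog endpoint (e.g. Lakekeeper)
        "warehouse": warehouse,  # warehouse name registered in the catalog
        "s3.endpoint": s3_endpoint,
        "s3.region": region,
        "s3.access-key-id": s3_access_key,
        "s3.secret-access-key": s3_secret_key,
    }


def connect_rest_catalog(name: str, properties: Dict[str, str]):
    """Open the catalog; requires PyIceberg and a reachable REST endpoint."""
    # Imported lazily so the property builder works without PyIceberg.
    from pyiceberg.catalog import load_catalog

    return load_catalog(name, **properties)


# Hypothetical values for a local Lakekeeper + MinIO setup.
example_properties = build_rest_catalog_properties(
    uri="http://localhost:8181/catalog",
    warehouse="texera",
    region="us-west-2",
    s3_endpoint="http://localhost:9000",
    s3_access_key="example-access-key",
    s3_secret_key="example-secret-key",
)
```

Calling connect_rest_catalog("texera", example_properties) would then return a catalog object, provided the endpoint is running. Keeping the property dictionary separate from the connection call makes it straightforward to pass the same settings from the Scala side to a Python worker.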
Any related issues, documentation, discussions?
Part of #4126. Subsequent PRs will cover:
How was this PR tested?
Manual
Was this PR authored or co-authored using generative AI tooling?
Co-authored with Claude