feat(storage): implement opendal resolving storage #2231

Merged
blackmwk merged 9 commits into apache:main from CTTY:ctty/opendal-resolving
Mar 17, 2026

@CTTY (Collaborator) commented Mar 11, 2026

Which issue does this PR close?

What changes are included in this PR?

  • Add OpenDalResolvingStorage

Are these changes tested?

Added a new test

@blackmwk (Contributor) left a comment

Thanks @CTTY for this PR, generally LGTM!

props: HashMap<String, String>,
/// Cache of scheme → storage mappings.
#[serde(skip, default)]
storages: RwLock<HashMap<String, Arc<OpenDalStorage>>>,

@blackmwk (Contributor):

Should this be a multimap? For example, we may need to support both the s3 and s3a schemes for S3 storage.

@CTTY (Collaborator, Author):

The key here is a string representing a scheme, so you can have both within one map:

("s3", OpenDalS3Storage),
("s3a", AnotherOpenDalS3Storage)

Or are we thinking of mapping one scheme to multiple storages?
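
For illustration, a scheme-keyed cache like the one described in this comment could be sketched as follows. This is a minimal sketch using only the standard library, not the code in this PR: `SchemeStorage` and `ResolvingCache` are hypothetical stand-ins for `OpenDalStorage` and the resolving storage, and the scheme parsing is deliberately naive.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for `OpenDalStorage`; it only remembers its scheme.
#[derive(Debug)]
pub struct SchemeStorage {
    pub scheme: String,
}

/// Sketch of a cache that lazily builds one storage per URL scheme.
#[derive(Default)]
pub struct ResolvingCache {
    storages: RwLock<HashMap<String, Arc<SchemeStorage>>>,
}

impl ResolvingCache {
    /// Extracts the scheme from a path such as `s3://bucket/key`.
    pub fn scheme_of(path: &str) -> Option<&str> {
        path.split_once("://").map(|(scheme, _)| scheme)
    }

    /// Returns the cached storage for the path's scheme, creating it on first use.
    pub fn resolve(&self, path: &str) -> Option<Arc<SchemeStorage>> {
        let scheme = Self::scheme_of(path)?;

        // Fast path: a shared read lock is enough when the entry already exists.
        {
            let guard = self.storages.read().unwrap();
            if let Some(found) = guard.get(scheme) {
                return Some(Arc::clone(found));
            }
        }

        // Slow path: take the write lock. `entry` re-checks the key, so a
        // storage inserted by another thread in the meantime is reused.
        let mut guard = self.storages.write().unwrap();
        let storage = guard.entry(scheme.to_string()).or_insert_with(|| {
            Arc::new(SchemeStorage {
                scheme: scheme.to_string(),
            })
        });
        Some(Arc::clone(storage))
    }
}
```

With this layout, "s3" and "s3a" are simply two independent keys, so each scheme gets its own storage instance.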

@blackmwk (Contributor):

I was thinking we should map them to the same storage instance. A storage instance holds a lot of resources, such as a connection pool.
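
One way the sharing suggested here could look is to normalize scheme aliases to a canonical backend key before the lookup, so that both schemes resolve to the same `Arc`. This is a hedged sketch that assumes a single storage instance can serve several schemes; `PooledStorage`, `SharedStorages`, and `canonical_backend` are hypothetical names, and the alias table is illustrative only.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical storage that owns expensive resources (e.g. a connection pool).
pub struct PooledStorage {
    pub backend: &'static str,
}

/// Maps a scheme alias to a canonical backend name (illustrative table).
pub fn canonical_backend(scheme: &str) -> Option<&'static str> {
    match scheme {
        "s3" | "s3a" => Some("s3"),
        _ => None,
    }
}

/// Sketch of a registry keyed by canonical backend rather than raw scheme.
#[derive(Default)]
pub struct SharedStorages {
    by_backend: HashMap<&'static str, Arc<PooledStorage>>,
}

impl SharedStorages {
    /// Returns the shared storage for the scheme's backend, creating it once.
    pub fn get_or_create(&mut self, scheme: &str) -> Option<Arc<PooledStorage>> {
        let backend = canonical_backend(scheme)?;
        let storage = self
            .by_backend
            .entry(backend)
            .or_insert_with(|| Arc::new(PooledStorage { backend }));
        Some(Arc::clone(storage))
    }
}
```

Here a lookup for "s3" and a lookup for "s3a" return clones of the same `Arc`, so the underlying resources are created only once.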

@CTTY (Collaborator, Author):

Currently OpenDalStorage::{S3, Azdls} carries a configured_scheme, and the path it handles must match that scheme, so technically paths with different schemes shouldn't share a storage instance.

https://github.com/apache/iceberg-rust/blob/main/crates/storage/opendal/src/lib.rs#L110

I'm not quite sure of the reason, though; maybe it's an OpenDAL limitation? I think we can improve this in a separate PR if needed.
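
To make the constraint concrete, the kind of check described above might look roughly like the following. This is an illustrative sketch and not the code at the linked line: `ConfiguredStorage`, its `bucket` field, and `relative_path` are assumptions made for the example.

```rust
/// Hypothetical storage bound to a single configured scheme and bucket.
pub struct ConfiguredStorage {
    configured_scheme: String,
    bucket: String,
}

impl ConfiguredStorage {
    pub fn new(configured_scheme: &str, bucket: &str) -> Self {
        Self {
            configured_scheme: configured_scheme.to_string(),
            bucket: bucket.to_string(),
        }
    }

    /// Strips `<scheme>://<bucket>/` from the path, or fails if the path
    /// does not use the configured scheme and bucket.
    pub fn relative_path<'a>(&self, path: &'a str) -> Result<&'a str, String> {
        let prefix = format!("{}://{}/", self.configured_scheme, self.bucket);
        path.strip_prefix(prefix.as_str())
            .ok_or_else(|| format!("path {path} does not start with {prefix}"))
    }
}
```

Under this sketch, a storage configured for "s3" rejects an `s3a://` path even when the bucket is the same, which is why one instance cannot serve both schemes as long as the check is in place.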

@blackmwk (Contributor):

IIRC, configured_scheme is a legacy setting from before we refactored the Storage trait. I think we no longer need this field, since Storage now accepts the full URL. Please create an issue to track it.

@CTTY (Collaborator, Author):

Created a tracking issue: #2245

@blackmwk (Contributor) left a comment

Thanks @CTTY!

@blackmwk blackmwk merged commit ffd6454 into apache:main Mar 17, 2026
19 checks passed
@CTTY CTTY deleted the ctty/opendal-resolving branch March 17, 2026 17:12

Development

Successfully merging this pull request may close these issues.

Implement ResolvingStorage for OpenDal

2 participants