Description
Is your feature request related to a problem or challenge?
Users who want to work with multiple storages can use the resolving storage that we are about to add in #2231. However, the path still needs to be resolved inside the Storage, which can be expensive for some users.
We want to add an API that resolves the Storage at initialization, so users don't have to resolve paths during the actual Storage operations. This especially benefits RestCatalog, where file_io is loaded separately for different tables.
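To make the idea concrete, here is a minimal, self-contained sketch of resolving a backend once from a table location instead of on every operation. Everything in it (`Storage` as a one-method trait, `NamedStorage`, `SchemeResolvingFactory`, `build_for_location`, `scheme_of`, `demo_factory`) is a hypothetical stand-in for illustration and not the iceberg-rust API.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Hypothetical stand-in for a storage backend; NOT the iceberg-rust `Storage` trait.
trait Storage {
    fn name(&self) -> &str;
}

// Trivial backend that only carries a label, for illustration.
struct NamedStorage(&'static str);

impl Storage for NamedStorage {
    fn name(&self) -> &str {
        self.0
    }
}

// Extract the scheme from a location such as "s3://bucket/table".
// Returns None when the location has no "://" separator.
fn scheme_of(location: &str) -> Option<&str> {
    location.split_once("://").map(|(scheme, _)| scheme)
}

// Hypothetical factory that picks one backend based on the table location.
struct SchemeResolvingFactory {
    scheme_to_storage: HashMap<String, Arc<dyn Storage>>,
}

impl SchemeResolvingFactory {
    // Resolve once, at initialization time. The returned storage can then be
    // used directly, without re-resolving the scheme for each path.
    fn build_for_location(&self, table_location: &str) -> Option<Arc<dyn Storage>> {
        let scheme = scheme_of(table_location)?;
        self.scheme_to_storage.get(scheme).cloned()
    }
}

// Build a factory with two registered schemes (assumed names).
fn demo_factory() -> SchemeResolvingFactory {
    let mut scheme_to_storage: HashMap<String, Arc<dyn Storage>> = HashMap::new();
    scheme_to_storage.insert("s3".to_string(), Arc::new(NamedStorage("s3-backend")));
    scheme_to_storage.insert("gs".to_string(), Arc::new(NamedStorage("gcs-backend")));
    SchemeResolvingFactory { scheme_to_storage }
}
```

In this sketch, calling `demo_factory().build_for_location("s3://bucket/db/table")` yields the backend registered under `s3`, while a location with an unregistered or missing scheme yields `None`. A real implementation would presumably return an error instead.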
Describe the solution you'd like
This can be achieved by adding a new API, with_metadata(metadata: TableMetadata), to the StorageFactory trait. Suppose a user wants to rely on the metadata location to resolve the Storage. They can do that by implementing a custom storage factory:
struct ResolvingStorageFactory {
    scheme_to_storage: HashMap<String, Arc<dyn Storage>>,
    metadata: TableMetadata,
}

impl StorageFactory for ResolvingStorageFactory {
    // Proposed new API: attach the table metadata before building.
    fn with_metadata(metadata) { self.metadata = metadata }

    fn build(config) {
        // Resolve the scheme once from the table location.
        scheme = resolve(self.metadata.location)
        return self.scheme_to_storage.get(scheme)
    }
}
In RestCatalog::load_file_io, we can have:
let factory = self
    .storage_factory
    .clone()
    .ok_or_else(|| {
        Error::new(
            ErrorKind::Unexpected,
            "StorageFactory must be provided for RestCatalog. Use `with_storage_factory` to configure it.",
        )
    })?
    .with_metadata(metadata); // always attach metadata for RestCatalog's storage initialization
Willingness to contribute
I can contribute to this feature independently