GH-3279: Introduce StorageProvider to de-couple Hadoop vs NIO constructs #3280
ArnavBalyan wants to merge 3 commits into apache:master
ArnavBalyan commented on Aug 25, 2025
- Add a minimal StorageProvider abstraction and a selector that routes to Hadoop or non-Hadoop classes.
- Make Hadoop I/O resolve the FileSystem per path so that each path reaches the correct connector.
- This isolates local I/O from Hadoop today and sets up a clean interface to pull the correct concrete implementation at runtime.
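A minimal sketch of what such an abstraction and selector could look like, for readers following the thread. Only `openForRead(String)` is visible in the quoted diff; the names `openForWrite`, `NioStorageProvider`, `StorageProviders.forPath`, and the scheme-based selection rule are assumptions made for illustration and may not match the PR.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical shape of the abstraction. Only openForRead appears in the diff.
interface StorageProvider {
    InputStream openForRead(String path) throws IOException;

    // Method name assumed; the diff shows only its Javadoc (path, overwrite).
    OutputStream openForWrite(String path, boolean overwrite) throws IOException;
}

// NIO-backed provider for local files. It references no Hadoop classes.
class NioStorageProvider implements StorageProvider {
    @Override
    public InputStream openForRead(String path) throws IOException {
        return Files.newInputStream(Paths.get(path));
    }

    @Override
    public OutputStream openForWrite(String path, boolean overwrite) throws IOException {
        Path p = Paths.get(path);
        if (overwrite) {
            return Files.newOutputStream(p, StandardOpenOption.CREATE,
                    StandardOpenOption.TRUNCATE_EXISTING, StandardOpenOption.WRITE);
        }
        // CREATE_NEW fails with FileAlreadyExistsException if the file exists.
        return Files.newOutputStream(p, StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
    }
}

// Hypothetical selector. Assumption: paths without a "scheme://" prefix are local.
class StorageProviders {
    static StorageProvider forPath(String path) {
        if (!path.contains("://")) {
            return new NioStorageProvider();
        }
        // A Hadoop-backed provider would be chosen here; it is omitted so that
        // this sketch compiles without Hadoop on the classpath.
        throw new UnsupportedOperationException("no provider in this sketch for: " + path);
    }
}

class StorageProviderDemo {
    // Writes text through the provider, reads it back, and returns what was read.
    static String roundTrip(String text) {
        try {
            Path tmp = Files.createTempFile("storage-provider", ".txt");
            try {
                StorageProvider sp = StorageProviders.forPath(tmp.toString());
                try (OutputStream out = sp.openForWrite(tmp.toString(), true)) {
                    out.write(text.getBytes(StandardCharsets.UTF_8));
                }
                try (InputStream in = sp.openForRead(tmp.toString())) {
                    return new String(in.readAllBytes(), StandardCharsets.UTF_8);
                }
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Callers in this sketch only ever see the `StorageProvider` interface, so the local code path never needs to load a Hadoop class.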
 * Opens the given path for reading.
 *
 * @param path fully-qualified file path (implementation specific semantics)
 * @return an InputStream that must be closed by the caller
The Javadoc should specify which IOException subclasses may be
thrown (e.g., FileNotFoundException, AccessDeniedException) so
that implementers and callers can handle errors appropriately.
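One way the suggestion could be applied is sketched below. The interface name `DocumentedReader` and the chosen `@throws` clauses are assumptions for illustration. Note that an NIO-backed implementation reports a missing file as `java.nio.file.NoSuchFileException`, whereas `java.io.FileNotFoundException` is what `java.io` streams use (and, as far as I know, what Hadoop's `FileSystem` typically surfaces), so the contract may need to name both.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.AccessDeniedException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Paths;

// Hypothetical interface; only the @throws clauses are the point of this sketch.
interface DocumentedReader {
    /**
     * Opens the given path for reading.
     *
     * @param path fully-qualified file path
     * @return an InputStream that must be closed by the caller
     * @throws NoSuchFileException   if the path does not exist (NIO-backed implementations)
     * @throws FileNotFoundException if the path does not exist (java.io-style implementations)
     * @throws AccessDeniedException if the caller lacks permission to read the path
     * @throws IOException           for any other I/O failure
     */
    InputStream openForRead(String path) throws IOException;
}

class ReadErrorDemo {
    // NIO-backed implementation, written as a lambda for brevity.
    static final DocumentedReader NIO = path -> Files.newInputStream(Paths.get(path));

    // Shows how a caller can branch on the documented exception types.
    static String classify(String path) {
        try (InputStream in = NIO.openForRead(path)) {
            return "ok";
        } catch (NoSuchFileException | FileNotFoundException e) {
            return "missing";
        } catch (AccessDeniedException e) {
            return "denied";
        } catch (IOException e) {
            return "io-error";
        }
    }
}
```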
 * @param path fully-qualified file path
 * @param overwrite whether an existing file should be replaced
 * @return an OutputStream that must be closed by the caller
 */
The interface should clarify stream ownership and closing
responsibilities. Consider returning AutoCloseable wrappers or
documenting that callers must use try-with-resources.
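To illustrate the try-with-resources option: `InputStream` and `OutputStream` already implement `AutoCloseable`, so a caller-side pattern like the one below closes the stream even when the body throws. `TrackingInputStream` and `OwnershipDemo` are hypothetical helpers that exist only to make the close observable.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical stream that records whether close() was called.
class TrackingInputStream extends ByteArrayInputStream {
    boolean closed = false;

    TrackingInputStream(byte[] data) {
        super(data);
    }

    @Override
    public void close() throws IOException {
        closed = true;
        super.close();
    }
}

class OwnershipDemo {
    // Returns true if the stream was closed even though the body threw.
    static boolean closedAfterFailure() {
        TrackingInputStream tracked = new TrackingInputStream(new byte[] {1, 2, 3});
        try (InputStream in = tracked) {
            in.read();
            throw new IllegalStateException("simulated failure while reading");
        } catch (IllegalStateException expected) {
            // close() has already run by the time control reaches this handler.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return tracked.closed;
    }
}
```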
 * @param path fully-qualified file path (implementation specific semantics)
 * @return an InputStream that must be closed by the caller
 */
InputStream openForRead(String path) throws IOException;
If we later want to use this abstraction to read Parquet files, the calling code will need to create a SeekableInputStream.
Would it be more useful to return SeekableInputStream, which already extends InputStream?
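For context on why seeking matters here: a Parquet file ends with a 4-byte little-endian footer length followed by the magic bytes `PAR1`, so a reader has to jump to the end of the file before it can locate the metadata. The sketch below shows that access pattern with a plain NIO `SeekableByteChannel` rather than Parquet's `SeekableInputStream`, so that it compiles without Parquet on the classpath. The class `FooterTailDemo` is hypothetical, and the file it writes is synthetic, not a valid Parquet file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.SeekableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class FooterTailDemo {
    // Reads the 8-byte tail: 4-byte little-endian footer length, then "PAR1".
    static int readFooterLength(Path file) throws IOException {
        try (SeekableByteChannel ch = Files.newByteChannel(file)) {
            if (ch.size() < 8) {
                throw new IOException("file too short to hold a footer tail");
            }
            ByteBuffer tail = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN);
            ch.position(ch.size() - 8); // the seek a plain InputStream cannot express
            while (tail.hasRemaining()) {
                if (ch.read(tail) < 0) {
                    throw new IOException("unexpected end of file");
                }
            }
            tail.flip();
            int footerLength = tail.getInt();
            byte[] magic = new byte[4];
            tail.get(magic);
            if (!"PAR1".equals(new String(magic, StandardCharsets.US_ASCII))) {
                throw new IOException("missing PAR1 magic");
            }
            return footerLength;
        }
    }

    // Writes 16 filler bytes plus a tail declaring a 16-byte footer, then reads it back.
    static int demo() {
        try {
            Path tmp = Files.createTempFile("footer-tail", ".bin");
            try {
                ByteBuffer buf = ByteBuffer.allocate(16 + 8).order(ByteOrder.LITTLE_ENDIAN);
                buf.put(new byte[16]);
                buf.putInt(16);
                buf.put("PAR1".getBytes(StandardCharsets.US_ASCII));
                Files.write(tmp, buf.array());
                return readFooterLength(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With only `InputStream`, the same read would require skipping through the whole file from the start, which is why a seekable return type may fit the Parquet use case better.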
This pull request has been automatically marked as stale because it has had no activity for at least 2 months. If you are still working on this change or plan to move it forward, please leave a comment or push a new commit so we know to keep it open. Otherwise, this PR will be closed automatically in about one month. Thank you for your contribution to Apache Parquet!