- BFSJ
- BucketFS Java
const~log-based-synchronization-check-has-a-minimum-resolution-of-one-second~1
At least the monitoring solution based on cluster logs is limited to a resolution of one second.
That means this monitor cannot distinguish between subsequent uploads to the same object in a bucket if they are not at least one second apart.
For more details on what this means and how we deal with this constraint see the design decision in section "How do we validate that objects on BucketFS are ready to use".
Needs: dsn
A Bucket can hold entries with a common prefix and a slash / as separator. When interpreting this as a hierarchy similar to a file system, we need to consider that a Bucket also allows a file to have the same name as a directory. For example, name/child.txt and name can exist at the same time.
BucketFS offers a web API that is not compatible with established standards like WebDAV. While similar in some parts, the differences are big enough that standard client libraries can't be used. This library abstracts the underlying HTTP requests and responses to a level that lets users deal with buckets and their contents directly instead.
Please refer to the System Requirement Specification for user-level requirements.
This section introduces the building blocks of the software. Together those building blocks make up the big picture of the software structure.
The Bucket*Configuration is a set of objects representing the setup of an Exasol cluster.
Needs: impl
The CommandFactory building block allows executing RPC commands like creating new buckets.
The Bucket building block controls interaction with a bucket in BucketFS.
The HttpClientBuilder building block creates and configures HTTP clients, including TLS configuration.
This section describes the runtime behavior of the software.
The CommandFactory allows creating new buckets, specifying required arguments.
dsn~creating-new-bucket~1
Covers:
req~creating-new-buckets~1
Needs: impl, itest
dsn~bucket-lists-its-contents~2
The Bucket lists its contents as a sorted list of object names.
Covers:
req~bucket-content-listing~1
Needs: impl, itest
dsn~bucket-lists-its-contents-recursively~1
The Bucket lists its contents recursively.
Covers:
req~bucket-content-listing-recursive~1
Needs: impl, itest
dsn~bucket-lists-files-with-common-prefix~1
The list of contents of a path in the bucket contains files as well as sub-directories with this path as prefix.
Covers:
req~bucket-content-listing~1
Needs: impl, utest
dsn~bucket-lists-file-and-directory-with-identical-name~1
If a Bucket contains two entries sharing the same prefix and only one of these entries has a path separator after the prefix, then the list of contents of the bucket contains two entries.
Covers:
req~bucket-content-listing~1
Needs: impl, utest
dsn~get-the-udf-bucket-path~1
- The bucket API implements a method that returns the correct path for a bucket from the UDF's perspective.
- This method returns the UDF-visible path to the root of this bucket within BucketFS.
- This method ensures consistency and avoids human error by generating the correct chrooted path as seen from within a UDF environment.
Rationale:
- BucketFS is the only accessible filesystem for UDFs and it operates in a chrooted environment.
- Paths inside UDFs differ from those on the host system or exposed via the BucketFS web interface.
- This method abstracts away those differences and provides the correct UDF-local path.
Covers:
Needs: impl, utest
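A minimal sketch of how such a path could be composed. It assumes that the UDF-visible root of a bucket has the form /buckets/<service name>/<bucket name>, which this document does not state. The class and method names are hypothetical and not taken from the library's API.

```java
/** Hypothetical helper for illustration; not part of the BFSJ API. */
final class UdfPathSketch {
    // Assumption: inside the chrooted UDF environment, buckets appear below this directory.
    private static final String UDF_BUCKETS_ROOT = "/buckets";

    private UdfPathSketch() {
        // static helper, not meant to be instantiated
    }

    /** Builds the path to the root of a bucket as a UDF would see it. */
    static String udfBucketPath(final String serviceName, final String bucketName) {
        requireSimpleName(serviceName, "serviceName");
        requireSimpleName(bucketName, "bucketName");
        return UDF_BUCKETS_ROOT + "/" + serviceName + "/" + bucketName;
    }

    // Rejects values that would silently change the directory depth of the result.
    private static void requireSimpleName(final String value, final String label) {
        if (value == null || value.isEmpty() || value.contains("/")) {
            throw new IllegalArgumentException(label + " must be a non-empty name without '/'");
        }
    }
}
```

Generating the path in one place means callers never concatenate the host-side or web-interface path by hand, which is the human error the design item wants to avoid.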
dsn~bucket-lists-directories-with-suffix~1
Directories in the list of bucket contents end with a slash /.
Rationale:
- This makes it easier for users to distinguish files from directories in a bucket listing, especially if they have the same name.
Covers:
req~bucket-content-listing~1
Needs: impl, utest
dsn~uploading-to-bucket~1
The Bucket offers uploading a file from a locally accessible filesystem to a bucket in BucketFS.
Covers:
req~uploading-a-file-to-bucketfs~1
Needs: impl, itest
dsn~uploading-strings-to-bucket~1
The Bucket offers uploading strings into a file in a bucket in BucketFS.
Covers:
req~uploading-text-to-a-file-in-bucketfs~1
Needs: impl, itest
dsn~uploading-input-stream-to-bucket~1
The Bucket offers uploading of the contents of an InputStream into a file in that bucket on BucketFS.
Covers:
req~uploading-input-stream-to-a-file-in-bucketfs~1
Needs: impl, itest
dsn~waiting-until-file-appears-in-target-directory~1
When uploading a file into a bucket, users can choose to block the call until the file appears in the bucket's target directory.
Covers:
req~waiting-for-bucket-content-synchronization~1
Needs: impl, itest
dsn~waiting-until-archive-extracted~1
When uploading an archive of type .tar.gz or .zip into a bucket, users can choose to block the call until the archive is fully extracted in the bucket's target directory.
Covers:
req~waiting-for-bucket-content-synchronization~1
Needs: impl, itest
dsn~conditional-upload-by-existence~1
BFSJ can check if a file needs to get uploaded by checking if the file exists in the Bucket.
Covers:
req-conditional-upload~1
Needs: impl, itest
dsn~conditional-upload-by-size~1
BFSJ always uploads files that are smaller than or equal to 1 MB.
Rationale:
For files this small, the checksum comparison would be more expensive than simply uploading the file again.
Covers:
req-conditional-upload~1
Needs: impl, itest
dsn~conditional-upload-by-checksum~1
BFSJ checks if a file needs to get uploaded by comparing the checksum.
Covers:
req-conditional-upload~1
Needs: impl, itest
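The three conditional upload checks above can be combined into one decision. The following is a sketch under stated assumptions: the checksum algorithm (SHA-256), the reading of 1 MB as 1,000,000 bytes, and all names are hypothetical. How the checksum of the file in the bucket is obtained is not shown, so it is passed in as a parameter.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Optional;

/** Hypothetical illustration of the upload decision; not the BFSJ implementation. */
final class ConditionalUploadSketch {
    // Assumption: "1 MB" is interpreted as 10^6 bytes.
    static final long SIZE_THRESHOLD_BYTES = 1_000_000L;

    private ConditionalUploadSketch() {
    }

    /**
     * @param localContent   content of the local file
     * @param remoteChecksum checksum of the file in the bucket, empty if the file does not exist there
     */
    static boolean needsUpload(final byte[] localContent, final Optional<String> remoteChecksum) {
        if (!remoteChecksum.isPresent()) {
            return true; // check by existence: file is missing in the bucket
        }
        if (localContent.length <= SIZE_THRESHOLD_BYTES) {
            return true; // check by size: small files are always uploaded
        }
        // check by checksum: upload only if the content differs
        return !sha256Hex(localContent).equalsIgnoreCase(remoteChecksum.get());
    }

    /** Returns the SHA-256 digest of the data as lowercase hex (algorithm choice is an assumption). */
    static String sha256Hex(final byte[] data) {
        try {
            final byte[] hash = MessageDigest.getInstance("SHA-256").digest(data);
            final StringBuilder hex = new StringBuilder(hash.length * 2);
            for (final byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (final NoSuchAlgorithmException exception) {
            throw new IllegalStateException("SHA-256 must be available on every Java platform", exception);
        }
    }
}
```

The order of the checks matters: the two cheap checks run first, so the checksum is only computed for large files that already exist in the bucket.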
dsn~delete-a-file-from-a-bucket~1
The Bucket offers deleting a file from a bucket.
Covers:
req~deleting-a-file-from-bucketfs~1
Needs: impl, itest
dsn~downloading-a-file-from-a-bucket~1
The Bucket offers downloading a file from a bucket in BucketFS to a locally accessible filesystem.
Covers:
req~downloading-a-file-from-bucketfs~1
Needs: impl, itest
dsn~downloading-a-file-from-a-bucket-as-string~1
The Bucket offers downloading a file from a bucket in BucketFS as a Java string.
Covers:
req~downloading-a-file-from-bucketfs-as-string~1
Needs: impl, itest
dsn~tls-configuration~1
The Bucket allows the user to enable TLS encryption using builder method useTls(boolean).
Covers:
Needs: impl, utest, itest
dsn~custom-tls-certificate~1
If the user has specified a certificate, the HttpClientBuilder creates and uses a custom TrustManager that trusts the given certificate.
Rationale:
This allows connecting to a database that uses a self-signed certificate while still validating the certificate.
Covers:
Needs: impl, utest, itest
See the constraint "format of entries in a Bucket".
The list of contents of a Bucket could either be represented as a hierarchy or as a flat list, potentially with
- multiple entries sharing a common prefix
- prefixes containing one or multiple slash / separators
The design decision is to interpret a Bucket as containing a hierarchy of entries. Each entry may either be a file or a directory. An entry is a directory if it has children, otherwise the entry is a file. An entry has children when its name contains the BucketFS separator /.
Examples:
- a.txt is a file
- a/b.txt is interpreted as directory a containing file b.txt
This in particular affects the list of Bucket contents.
Rationale:
- A hierarchical representation of files and directories provides additional benefits:
- Hierarchies are a convenient and familiar concept to users.
- Hierarchies enable operations on multiple entries in a common scope, e.g. list, copy, or delete.
To support the coexistence of files and directories with the same name, directories should be represented with a slash / as suffix. The list of contents of a directory may then contain the same entry twice:
- once as file (without suffix)
- a second time as directory (with suffix)
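The hierarchical interpretation can be illustrated with a small sketch that derives the listing of one directory from the flat list of object names. The class and method names are hypothetical. The sketch only demonstrates the rules described above (direct children only, directories with a slash suffix, sorted output) and is not the library's implementation.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical illustration of the hierarchical listing rules. */
final class BucketListingSketch {
    private static final String SEPARATOR = "/";

    private BucketListingSketch() {
    }

    /**
     * Lists the direct children of a directory.
     *
     * @param objectNames flat list of object names in the bucket
     * @param directory   directory to list, empty string for the bucket root
     * @return sorted entries; directories carry a trailing slash
     */
    static List<String> listDirectory(final Collection<String> objectNames, final String directory) {
        final String prefix = (directory.isEmpty() || directory.endsWith(SEPARATOR)) ? directory
                : directory + SEPARATOR;
        final SortedSet<String> entries = new TreeSet<>(); // sorted and free of duplicates
        for (final String name : objectNames) {
            if (!name.startsWith(prefix) || (name.length() == prefix.length())) {
                continue; // outside of the listed directory
            }
            final String remainder = name.substring(prefix.length());
            final int separatorIndex = remainder.indexOf(SEPARATOR);
            if (separatorIndex < 0) {
                entries.add(remainder); // no further separator: a file
            } else {
                entries.add(remainder.substring(0, separatorIndex + 1)); // keep the slash: a directory
            }
        }
        return new ArrayList<>(entries);
    }
}
```

For the object names name, name/child.txt, a.txt, and a/b.txt, listing the root yields a.txt, a/, name, and name/. The entry name appears twice, once as a file and once as a directory.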
BucketFS is a distributed filesystem with an HTTP interface. When users upload objects to a Bucket, it takes a while until they are really usable.
This is caused by various asynchronous processes an object has to go through, like node synchronization and extraction of archives.
In automated workflows, this is important, because reliable tests require objects to be available completely after they are uploaded.
- Checking via HTTP GET. Unfortunately this variant is not reliable.
- Checking all nodes via HTTP GET. Suffers from the same problem as the previous idea and additionally requires that the client library knows all data nodes. On top of that, the variant's overhead grows proportionally with the number of nodes.
We decided to define a monitoring interface that software using the library needs to implement. This allows at least consumers with access to cluster-internal information to provide an implementation of this interface.
Users have the option to instantiate Bucket objects with synchronization checking if they provide a monitoring implementation. Otherwise they need to fall back to non-blocking operation.
dsn~validating-bucketfs-object-synchronization-via-monitoring-api~1
The SyncAwareBucket uses a BucketFsMonitor to check object synchronization.
Covers:
req~waiting-for-bucket-content-synchronization~1
Needs: impl, itest
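A blocking wait on top of such a monitor could look roughly like the following sketch. The SyncMonitor interface and its method are hypothetical stand-ins. The actual signature of BucketFsMonitor is not given in this document, and the polling approach with a timeout is an assumption.

```java
import java.time.Duration;

/** Hypothetical illustration of blocking until a monitor reports synchronization. */
final class SyncWaitSketch {
    /** Stand-in for a monitoring interface that the consumer of the library implements. */
    @FunctionalInterface
    interface SyncMonitor {
        boolean isSynchronized(String pathInBucket);
    }

    private SyncWaitSketch() {
    }

    /**
     * Polls the monitor until the object is synchronized or the timeout expires.
     *
     * @return true if the monitor reported synchronization in time, false on timeout or interruption
     */
    static boolean waitUntilSynchronized(final SyncMonitor monitor, final String pathInBucket,
            final Duration timeout, final Duration pollInterval) {
        final long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (monitor.isSynchronized(pathInBucket)) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timeout expired
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (final InterruptedException exception) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }
}
```

Because the monitor is an interface, a consumer without access to cluster-internal information simply has no implementation to pass in and falls back to non-blocking operation.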
dsn~bucketfs-object-overwrite-throttle~1
The SyncAwareBucket delays subsequent uploads to the same path in a bucket so that they are far enough apart for the monitoring resolution of one second to distinguish them.
Comment:
The logs have a timestamp resolution of a second. That is why we delay a subsequent upload to the same path so that it starts after the next second.
Covers:
req~waiting-for-bucket-content-synchronization~1
const~log-based-synchronization-check-has-a-minimum-resolution-of-one-second~1
Needs: impl, itest
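The required delay can be computed from the time of the previous upload to the same path. The sketch below assumes that "after the next second" means the start of the wall-clock second following the one in which the previous upload happened. The names are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.temporal.ChronoUnit;

/** Hypothetical illustration of the overwrite throttle. */
final class OverwriteThrottleSketch {
    private OverwriteThrottleSketch() {
    }

    /**
     * Calculates how long to wait before uploading to a path again.
     *
     * @param lastUpload time of the previous upload to the same path
     * @param now        current time
     * @return remaining delay, zero if the next upload may start immediately
     */
    static Duration delayBeforeNextUpload(final Instant lastUpload, final Instant now) {
        // The log timestamp only carries whole seconds, so the next upload must fall into a later second.
        final Instant nextAllowed = lastUpload.truncatedTo(ChronoUnit.SECONDS).plusSeconds(1);
        return now.isBefore(nextAllowed) ? Duration.between(now, nextAllowed) : Duration.ZERO;
    }
}
```

A caller would keep the last upload time per path, for example in a map keyed by the path in the bucket, and sleep for the returned duration before starting the next upload.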
Exasol Docker DB by default generates a new self-signed TLS certificate at startup. That's why certificate validation will fail when connecting using Java's default keystore.
- Allow specifying a certificate
- Allow specifying a certificate fingerprint
Specifying a fingerprint is more convenient for the user and already a well-established practice when connecting to the database using a database driver like JDBC. However, specifying a certificate also verifies the hostname and is more secure.
See dsn~custom-tls-certificate~1
The Exasol Docker DB uses a self-signed certificate that is valid only for CN=*.exacluster.local. This certificate is not valid when connecting using hostname localhost or IP address 127.0.0.1. That's why connections fail with exception CertificateException: No subject alternative DNS name matching localhost found..
- Completely ignoring the host name during certificate validation is insecure.
- We allow the user to specify one or more host names or IP addresses (Subject Alternative Names (SAN)) that are also considered valid during certificate validation.
- We could automatically add the hostname used for connecting as an additional SAN. This would be more convenient, but is less secure and unexpected for the user. It is better to explicitly configure the additional SAN.
dsn~custom-tls-certificate.additional-subject-alternative-names~1
If the user has specified additional Subject Alternative Names (SAN), the HttpClientBuilder creates and uses a custom TrustManager that additionally accepts the given DNS names or IP addresses during certificate validation.
Covers:
Needs: impl, utest, itest
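The acceptance decision behind this design item can be sketched as plain name matching. This is a simplified illustration with hypothetical names: it only shows how explicitly configured names extend the set of names taken from the certificate. It does not implement a TrustManager, and the wildcard handling is reduced to a single left-most label.

```java
import java.util.Collection;
import java.util.Locale;

/** Hypothetical illustration of host name acceptance with additional SANs. */
final class SubjectAlternativeNameSketch {
    private SubjectAlternativeNameSketch() {
    }

    /**
     * @param host             host name or IP address used for connecting
     * @param certificateNames names found in the certificate, may contain wildcards like *.exacluster.local
     * @param additionalNames  names explicitly configured by the user, matched literally
     */
    static boolean isHostAccepted(final String host, final Collection<String> certificateNames,
            final Collection<String> additionalNames) {
        final String normalizedHost = host.toLowerCase(Locale.ROOT);
        for (final String name : certificateNames) {
            if (matches(normalizedHost, name.toLowerCase(Locale.ROOT))) {
                return true;
            }
        }
        for (final String name : additionalNames) {
            if (normalizedHost.equals(name.toLowerCase(Locale.ROOT))) {
                return true;
            }
        }
        return false;
    }

    // Simplification: a wildcard replaces exactly one non-empty left-most label.
    private static boolean matches(final String host, final String pattern) {
        if (!pattern.startsWith("*.")) {
            return host.equals(pattern);
        }
        final String suffix = pattern.substring(1); // for example ".exacluster.local"
        if (!host.endsWith(suffix) || (host.length() == suffix.length())) {
            return false;
        }
        final String label = host.substring(0, host.length() - suffix.length());
        return !label.contains(".");
    }
}
```

With only the certificate name *.exacluster.local, the host localhost is rejected. It is accepted once the user adds localhost to the additional names, which keeps the relaxation explicit.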
This document's section structure is derived from the "arc42" architectural template by Dr. Gernot Starke, Dr. Peter Hruschka.