Introduce StorageConnector for Azure #14660
Conversation
```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

public abstract class ChunkingStorageConnector<T> implements StorageConnector
```

Can you please Javadoc this, since it's the crux of this PR.
```java
public ChunkingStorageConnectorParameters<T> build()
{
  Preconditions.checkArgument(start >= 0, "'start' not provided or an incorrect value [%s] passed", start);
  Preconditions.checkArgument(end >= 0, "'end' not provided or an incorrect value [%s] passed", end);
```

Would end < start return a good error message?

Added a check for this in the PR as well!
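A minimal sketch of the combined validation discussed above. `checkArgument` here is a hand-rolled stand-in for Guava's `Preconditions.checkArgument` so the snippet is self-contained; the `validate` method name is illustrative, not the PR's actual API.

```java
public class ChunkRangeValidation
{
    // Minimal stand-in for Guava's Preconditions.checkArgument, so the
    // snippet compiles without the Guava dependency.
    static void checkArgument(boolean condition, String messageTemplate, Object... args)
    {
        if (!condition) {
            throw new IllegalArgumentException(String.format(messageTemplate, args));
        }
    }

    // The checks from the diff above, plus the end >= start check the
    // reviewer suggested, so end < start fails with a clear message.
    static void validate(long start, long end)
    {
        checkArgument(start >= 0, "'start' not provided or an incorrect value [%s] passed", start);
        checkArgument(end >= 0, "'end' not provided or an incorrect value [%s] passed", end);
        checkArgument(end >= start, "'end' [%s] should be >= 'start' [%s]", end, start);
    }

    public static void main(String[] args)
    {
        validate(0, 10); // valid range passes silently

        try {
            validate(10, 5); // end < start now fails fast
            throw new AssertionError("expected IllegalArgumentException");
        } catch (IllegalArgumentException expected) {
            // message names both bounds, which is the improvement asked for
        }
    }
}
```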
```java
{
  private static final long DOWNLOAD_MAX_CHUNK_SIZE_BYTES = 100_000_000;

  public ChunkingStorageConnector()
```

Does this need to be public?

Reverted the change so that the individual connectors can control the chunk sizes. This is used primarily for testing for now, though it can be extended to the real implementations as well.
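One way the constructor discussion above could play out is a constructor overload that lets tests (and, later, individual connectors) override the chunk size. This is a sketch only; the class and field names are assumptions, not the PR's exact code.

```java
public class ChunkSizeSketch
{
    // Default maximum bytes fetched per download chunk, as in the diff above.
    static final long DOWNLOAD_MAX_CHUNK_SIZE_BYTES = 100_000_000;

    final long chunkSizeBytes;

    // Public no-arg constructor keeps the default behavior.
    public ChunkSizeSketch()
    {
        this(DOWNLOAD_MAX_CHUNK_SIZE_BYTES);
    }

    // Overload so individual connectors and tests can control the chunk size.
    ChunkSizeSketch(long chunkSizeBytes)
    {
        this.chunkSizeBytes = chunkSizeBytes;
    }

    public static void main(String[] args)
    {
        assert new ChunkSizeSketch().chunkSizeBytes == 100_000_000L;
        assert new ChunkSizeSketch(1024).chunkSizeBytes == 1024L;
    }
}
```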
adarshsanjeev left a comment:

Looks good to me overall.
```java
  params.getMaxRetry()
),
outFile,
new byte[8 * 1024],
```

I know this code was only moved, but could you add a comment on why these numbers were chosen?
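The kind of explanatory comment the reviewer asks for might look like the sketch below: a buffered stream copy where the `8 * 1024` buffer choice is documented inline. The class and method names are illustrative, not the code being moved in the PR.

```java
import java.io.*;

public class ChunkedCopy
{
    // 8 KiB is a conventional I/O buffer size: large enough to amortize the
    // per-read call overhead, small enough that per-download memory stays
    // negligible even with many concurrent downloads.
    static final int COPY_BUFFER_SIZE = 8 * 1024;

    // Copies everything from 'in' to 'out' through a fixed-size buffer and
    // returns the number of bytes copied.
    static long copy(InputStream in, OutputStream out) throws IOException
    {
        byte[] buffer = new byte[COPY_BUFFER_SIZE];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    public static void main(String[] args) throws IOException
    {
        byte[] data = new byte[20_000];
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), out);
        assert copied == 20_000L;
        assert out.size() == 20_000;
    }
}
```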
cryptoe left a comment:

Changes LGTM. The user-facing docs are still remaining.
Thanks, @adarshsanjeev @cryptoe for the reviews, and @dhananjay1308 for testing the changes out on a cluster.
Description
This PR adds a storage connector that interacts with Azure's blob storage using the Azure API already used in Druid. This allows durable storage and MSQ's interactive APIs to work with Azure.

It also refactors the existing S3 connector so that the chunked downloads it currently performs can be extended to other connectors. (Note: this refactoring is ported from PR #14611, since that work is currently parked.)
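The chunked-download refactoring described above boils down to splitting a byte range into fixed-size pieces that can be fetched one at a time. A hypothetical sketch of that range-splitting step, with illustrative names that are not the PR's actual API:

```java
import java.util.*;

public class ChunkPlanner
{
    // Splits the half-open byte range [start, end) into consecutive chunks of
    // at most maxChunkSize bytes; the last chunk may be shorter.
    static List<long[]> chunks(long start, long end, long maxChunkSize)
    {
        List<long[]> result = new ArrayList<>();
        for (long offset = start; offset < end; offset += maxChunkSize) {
            result.add(new long[]{offset, Math.min(offset + maxChunkSize, end)});
        }
        return result;
    }

    public static void main(String[] args)
    {
        // A 250-byte range with 100-byte chunks yields [0,100), [100,200), [200,250).
        List<long[]> c = chunks(0, 250, 100);
        assert c.size() == 3;
        assert c.get(0)[0] == 0 && c.get(0)[1] == 100;
        assert c.get(2)[0] == 200 && c.get(2)[1] == 250;
    }
}
```

A connector extending the base class would then only need to supply "fetch bytes [a, b) of this object" for its storage service, which is what lets the S3 logic carry over to Azure.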
Testing plan
Release note
An Azure connector has been introduced: MSQ's fault tolerance and durable storage can now be used with Microsoft Azure's blob storage. In addition, the newly introduced queries from deep storage can now store and fetch their results from Azure's blob storage.
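Enabling this would be a matter of runtime configuration. The fragment below is a hedged sketch only: the property names are assumptions modeled on Druid's existing `druid.msq.intermediate.storage.*` durable-storage settings for S3, and should be checked against the Druid docs for the release this ships in.

```properties
# Sketch, not verified against the shipped docs: point MSQ durable storage at Azure.
druid.msq.intermediate.storage.enable=true
druid.msq.intermediate.storage.type=azure
# Container and prefix under which intermediate data / results are written
# (names here are placeholders).
druid.msq.intermediate.storage.container=my-container
druid.msq.intermediate.storage.prefix=msq-durable
```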
Key changed/added classes in this PR
This PR has: