[feat](storage)Support Azure Blob Storage #56861
run buildall
TPC-DS: Total hot run time: 190544 ms
ClickBench: Total hot run time: 30.67 s
[feat](storage)Support Azure Blob Storage (7157ce0 to f5dd462)
ClickBench: Total hot run time: 31.19 s
ClickBench: Total hot run time: 30.06 s
fe/fe-core/src/main/java/org/apache/doris/common/util/LocationPath.java
s3Props.put("AWS_SECRET_KEY", secretKey);
s3Props.put("AWS_NEED_OVERRIDE_ENDPOINT", "true");
s3Props.put("provider", "azure");
s3Props.put("PROVIDER", "AZURE");
Why does this need to be uppercase?
oops...removed
@Override
public String getStorageName() {
    return "Azure";
    return "AZURE";
why uppercase?
We've updated the logic to keep storage names fully uppercase for consistency, since both HDFS and S3 follow that convention. All callers already perform case-insensitive matching, so this change makes the style uniform without affecting compatibility.
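The compatibility claim above depends on callers comparing storage names without regard to case. A minimal sketch of that kind of comparison follows; `StorageNameMatch` and `matches` are hypothetical names for illustration, not the actual Doris call sites.

```java
final class StorageNameMatch {
    // Hypothetical helper: returns true when the configured provider string
    // equals the storage name, ignoring case. A null input never matches.
    static boolean matches(String configured, String storageName) {
        return configured != null && configured.equalsIgnoreCase(storageName);
    }

    public static void main(String[] args) {
        // Both spellings match the uppercase storage name.
        System.out.println(matches("azure", "AZURE"));
        System.out.println(matches("Azure", "AZURE"));
        // A different provider does not.
        System.out.println(matches("s3", "AZURE"));
    }
}
```

With a comparison like this, switching the returned name from `"Azure"` to `"AZURE"` would not change which callers match.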
@ConnectorProperty(names = {"azure.account_name", "azure.access_key", "s3.access_key",
        "AWS_ACCESS_KEY", "ACCESS_KEY", "access_key"},
        description = "The access key of S3.")
protected String accessKey = "";
How about changing this to "accountName"?
done
        "AWS_SECRET_KEY", "secret_key"},
        sensitive = true,
        description = "The secret key of S3.")
protected String secretKey = "";
Ditto.
boolean isPrefix = false;
while (blobPath.normalize().toString().startsWith(listPrefix)) {
while (null != blobPath && blobPath.normalize().toString().startsWith(listPrefix)) {
Why add `null != blobPath`? Is this a bug fix?
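One plausible reason for the guard, sketched under assumptions: if the loop body walks upward with `Path.getParent()`, that call returns `null` once no parent remains, and the next evaluation of the condition would throw a `NullPointerException`. The loop body below is an assumption (the hunk shows only the condition), and `ParentWalk` and `levelsUnderPrefix` are hypothetical names.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class ParentWalk {
    // Counts how many levels (the path itself plus its ancestors) still
    // start with listPrefix. The condition mirrors the reviewed line; the
    // getParent() step is assumed for illustration.
    static int levelsUnderPrefix(String blob, String listPrefix) {
        Path blobPath = Paths.get(blob);
        int levels = 0;
        while (null != blobPath && blobPath.normalize().toString().startsWith(listPrefix)) {
            levels++;
            // getParent() returns null for a single-element relative path,
            // so the null check above is what ends the walk safely.
            blobPath = blobPath.getParent();
        }
        return levels;
    }

    public static void main(String[] args) {
        System.out.println(levelsUnderPrefix("a/b/c", "a"));
    }
}
```

Without the `null != blobPath` check, the walk over `a/b/c` with prefix `a` would dereference `null` after visiting `a`.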
fe/fe-core/src/main/java/org/apache/doris/fs/obj/AzureObjStorage.java
Change the name to objCommit?
done
 */
public static String encodeToBase64(int id) {
    ByteBuffer buf = ByteBuffer.allocate(4)
            .order(ByteOrder.BIG_ENDIAN);
Use little-endian (LE) byte order to align with the BE side.
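To illustrate why the byte order matters here, the sketch below encodes the same `int` with both orders and yields different Base64 strings, so the two sides must agree. The class name `Base64IdSketch` and the extra `order` parameter are hypothetical; the method in the PR takes only `id`.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Base64;

final class Base64IdSketch {
    // Writes the id as 4 bytes in the given byte order, then Base64-encodes
    // those bytes. The order parameter exists only for this comparison.
    static String encodeToBase64(int id, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.allocate(4).order(order);
        buf.putInt(id);
        return Base64.getEncoder().encodeToString(buf.array());
    }

    public static void main(String[] args) {
        // id = 1 is 01 00 00 00 in little-endian and 00 00 00 01 in big-endian.
        System.out.println(encodeToBase64(1, ByteOrder.LITTLE_ENDIAN)); // AQAAAA==
        System.out.println(encodeToBase64(1, ByteOrder.BIG_ENDIAN));    // AAAAAQ==
    }
}
```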
PR approved by at least one committer and no changes requested.
ClickBench: Total hot run time: 29.09 s
## What's Changed
1. **Refined Azure Blob Configuration Naming**
   - Adopted Azure-native property names for better consistency with Azure SDK conventions:
     - `account_name` → Azure Storage Account Name
     - `account_key` → Azure Storage Account Key
   - Ensures compatibility, clarity, and alignment with Azure Blob attribute definitions.
2. **Full Feature Support for Azure Blob Storage**
   - Added comprehensive integration for the following modules:
     - **TVF (Table-Valued Function)**
     - **LOAD (Data Loading)**
     - **CATALOG (Metadata Querying)**
   - Azure Blob can now be used as both a data source and destination across all modules.
3. **Protocol Compatibility**
   - Added full support for multiple Azure storage access protocols:
     - `abfs://`
     - `abfss://`
     - `wasb://`
     - `wasbs://`
   - Automatically recognizes protocol prefixes and maps them to the correct Azure storage client implementation.
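
The prefix recognition described in item 3 could look roughly like the sketch below. It is an illustration only: `AzureSchemeSketch` and `isAzureLocation` are hypothetical names, the sample URIs are made up, and the actual mapping logic in the PR may differ.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Set;

final class AzureSchemeSketch {
    // The four schemes listed in the PR description.
    private static final Set<String> AZURE_SCHEMES = Set.of("abfs", "abfss", "wasb", "wasbs");

    // Returns true when the location's URI scheme is one of the Azure schemes.
    static boolean isAzureLocation(String location) {
        String scheme = URI.create(location).getScheme();
        return scheme != null && AZURE_SCHEMES.contains(scheme.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // Example locations; the container and account names are placeholders.
        System.out.println(isAzureLocation("abfss://container@account.dfs.core.windows.net/dir/file.parquet"));
        System.out.println(isAzureLocation("wasb://container@account.blob.core.windows.net/dir/file.csv"));
        System.out.println(isAzureLocation("s3://bucket/key"));
    }
}
```

Lowercasing the scheme before the lookup keeps the check tolerant of inputs such as `ABFSS://`.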
## todo
**Unified Connectivity Testing Framework**
- Refactored the connectivity test logic into a unified implementation
shared across all object storage backends (S3, OSS, COS, OBS, BOS, and
Azure).
- Improves code reusability and simplifies the process of adding new
storage providers.
cherry pick apache#56861 (cherry picked from commit 9177047)