
[GLUTEN-9182][VL] Support new s3 configuration in Gluten #9183

Merged
marin-ma merged 3 commits into apache:main from dcoliversun:s3c on Apr 3, 2025
Conversation

@dcoliversun
Contributor

What changes were proposed in this pull request?

This PR aims to add the s3a configuration supported by Velox in Gluten.

(Fixes: #9182)

How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)

(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)

@github-actions github-actions bot added the VELOX label Mar 31, 2025
@github-actions

#9182

@dcoliversun dcoliversun marked this pull request as draft March 31, 2025 07:54
@github-actions github-actions bot added the DOCS label Mar 31, 2025
@dcoliversun dcoliversun force-pushed the s3c branch 2 times, most recently from 14363ce to b9babdb Compare April 1, 2025 03:08
@dcoliversun dcoliversun changed the title from "[GLUTEN-9182][VL] Add the s3a configuration supported by Velox in Gluten" to "[GLUTEN-9182][VL] Support new s3 configuration in Gluten" Apr 1, 2025
@dcoliversun dcoliversun marked this pull request as ready for review April 1, 2025 03:36
@dcoliversun
Contributor Author

cc @zhouyuan Could you please review this PR?

Member

@zhouyuan zhouyuan left a comment

Thanks for adding these configurations!

* TRACE
"OFF", "FATAL", "ERROR", "WARN", "INFO", "DEBUG", "TRACE".

## Configuring Whether To Use Proxy From Env for S3 C++ Client
Member

I understand this aligns with the Velox config, but is it a standard aws-sdk-cpp feature? They seem to have disabled the proxy intentionally:
aws/aws-sdk-cpp#1049

Contributor Author

I believe this is a standard feature; see this doc: https://docs.aws.amazon.com/sdk-for-cpp/v1/developer-guide/client-config.html

{S3Config::Keys::kIamRole, std::make_pair("iam.role", std::nullopt)},
{S3Config::Keys::kIamRoleSessionName, std::make_pair("iam.role.session.name", "gluten-session")},
{S3Config::Keys::kEndpointRegion, std::make_pair("endpoint.region", std::nullopt)},
{S3Config::Keys::kCredentialsProvider, std::make_pair("aws.credentials.provider", std::nullopt)},
Member

Thanks for adding this.
cc @marin-ma
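
For readers unfamiliar with the key table quoted above, here is a minimal sketch of how such a table could drive the translation from Spark settings to native settings. It is not the Gluten implementation: the prefixes `spark.hadoop.fs.s3a.` and `hive.s3.`, the native key spellings, and the function name `toNativeS3Conf` are assumptions made for illustration. Only the suffixes and the `gluten-session` default come from the snippet.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for the table in the diff: native key -> (Hadoop-style
// suffix, optional default). The native key spellings are assumed.
using KeyTable =
    std::map<std::string, std::pair<std::string, std::optional<std::string>>>;

const KeyTable& keyTable() {
  static const KeyTable table = {
      {"iam-role", {"iam.role", std::nullopt}},
      {"iam-role-session-name",
       {"iam.role.session.name", std::string("gluten-session")}},
      {"endpoint.region", {"endpoint.region", std::nullopt}},
  };
  return table;
}

// Copies each known setting from the Spark conf into a native conf, falling
// back to the default when one exists and skipping the key otherwise.
std::map<std::string, std::string> toNativeS3Conf(
    const std::map<std::string, std::string>& sparkConf) {
  const std::string sparkPrefix = "spark.hadoop.fs.s3a."; // assumed prefix
  const std::string nativePrefix = "hive.s3.";            // assumed prefix
  std::map<std::string, std::string> out;
  for (const auto& [nativeKey, entry] : keyTable()) {
    const auto& [suffix, defaultValue] = entry;
    auto it = sparkConf.find(sparkPrefix + suffix);
    if (it != sparkConf.end()) {
      out[nativePrefix + nativeKey] = it->second;
    } else if (defaultValue.has_value()) {
      out[nativePrefix + nativeKey] = *defaultValue;
    }
  }
  return out;
}
```

With this shape, a key whose default is `std::nullopt` is simply absent from the native conf unless the user sets it, which leaves the native side free to apply its own fallback.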

Contributor

I wonder if it's safe to add this mapping in Gluten. If this configuration is not set, Velox uses its own logic to create credential providers that are supported in aws-sdk-cpp.

Say a workload is configured with spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider and the access/secret keys are set; it should go to (code link). But adding this mapping may break that logic and send it to (code link) instead. It will then fail with the error 'CredentialsProviderFactory for 'org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider' not registered'.

@dcoliversun @zhouyuan Do you have any thoughts or suggestions?
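
The concern above can be illustrated with a simplified model of the selection logic. This is a sketch only, not Velox code: `S3Settings`, `providerRegistry`, `selectProvider`, and the returned labels are hypothetical, and a plain string stands in for a real aws-sdk-cpp provider object. The assumption is that an explicitly configured provider name is looked up in a registry first, and the access/secret key path is only reached when no name is set.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical, simplified view of the S3 settings discussed in the thread.
struct S3Settings {
  std::optional<std::string> accessKey;
  std::optional<std::string> secretKey;
  std::optional<std::string> credentialsProvider;
};

// A factory returns a label here instead of a real provider object.
using ProviderFactory = std::function<std::string(const S3Settings&)>;

std::map<std::string, ProviderFactory>& providerRegistry() {
  static std::map<std::string, ProviderFactory> registry;
  return registry;
}

std::string selectProvider(const S3Settings& s) {
  // An explicit provider name takes precedence and must be registered.
  if (s.credentialsProvider.has_value()) {
    auto it = providerRegistry().find(*s.credentialsProvider);
    if (it == providerRegistry().end()) {
      throw std::runtime_error(
          "CredentialsProviderFactory for '" + *s.credentialsProvider +
          "' not registered");
    }
    return it->second(s);
  }
  // Without a name, fall back to keys if present, else a default chain.
  if (s.accessKey.has_value() && s.secretKey.has_value()) {
    return "static-keys";
  }
  return "default-chain";
}
```

In this model, a workload that sets both keys works until the Hadoop class name is forwarded as the provider name; at that point the lookup fails even though the keys are still available.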

Contributor Author

@dcoliversun dcoliversun Apr 3, 2025

@marin-ma Your concern is valid. Although SimpleAWSCredentialsProvider is not required when an access key and secret key (ak/sk) are declared, this piece of code might still disrupt some stable workflows. One idea is to use a different configuration name to distinguish it; another is to not set the credentials provider configuration when ak/sk are declared.

If we choose the second path, I think it would be better to add some code in Velox.
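
As a rough illustration of the second option, the sketch below drops the provider setting whenever both keys are present. It operates on a plain map on the Spark side purely for demonstration, whereas the comment suggests the real change would belong in Velox. `dropProviderWhenKeysSet` and `hasNonEmpty` are hypothetical names, and the access/secret key names are assumed to follow the usual Hadoop S3A convention.

```cpp
#include <cassert>
#include <map>
#include <string>

using Conf = std::map<std::string, std::string>;

// Key names assumed from the Hadoop S3A convention used in the thread.
const std::string kAccessKey = "spark.hadoop.fs.s3a.access.key";
const std::string kSecretKey = "spark.hadoop.fs.s3a.secret.key";
const std::string kProvider = "spark.hadoop.fs.s3a.aws.credentials.provider";

// True when the key exists and its value is not empty.
bool hasNonEmpty(const Conf& conf, const std::string& key) {
  auto it = conf.find(key);
  return it != conf.end() && !it->second.empty();
}

// Returns a copy of the conf without the provider entry when both keys are
// declared, so the key-based path is taken downstream.
Conf dropProviderWhenKeysSet(Conf conf) {
  if (hasNonEmpty(conf, kAccessKey) && hasNonEmpty(conf, kSecretKey)) {
    conf.erase(kProvider);
  }
  return conf;
}
```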

Contributor

Can we use the registration to add mappings from AWSCredentialsProvider to the implementation in aws-sdk-cpp?
e.g.

  registerAWSCredentialsProvider("org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider", [](const S3Config& config) {
    GLUTEN_CHECK(
        config.accessKey().has_value() && !config.accessKey().value().empty(),
        "Access key cannot be empty for SimpleAWSCredentialsProvider");
    GLUTEN_CHECK(
        config.secretKey().has_value() && !config.secretKey().value().empty(),
        "Secret key cannot be empty for SimpleAWSCredentialsProvider");
    return std::make_shared<Aws::Auth::SimpleAWSCredentialsProvider>(
        config.accessKey().value(), config.secretKey().value());
  });

Then we need to support all valid mappings in Gluten. Perhaps we can remove the configuration from this patch and add it back in another one together with adding support for all valid mappings. cc: @FelixYBW

Contributor Author

@dcoliversun dcoliversun Apr 3, 2025

Sure, I have removed the configuration from this patch. cc @marin-ma

@github-actions github-actions bot added the CORE works for Gluten Core label Apr 2, 2025
@github-actions

github-actions bot commented Apr 2, 2025

Run Gluten Clickhouse CI on x86

  ("spark.gluten.sql.columnar.backend.velox.fileHandleCacheEnabled", "false"),
- ("spark.gluten.velox.awsSdkLogLevel", "FATAL")
+ ("spark.gluten.velox.awsSdkLogLevel", "FATAL"),
+ ("spark.gluten.velox.s3UseProxyFromEnv", "false"),
Contributor

Why set them in the backend conf? It looks like the session conf is enough.

Contributor Author

Just like the sdkLogLevel configuration, these will be used by the Hive connector in Velox.

@github-actions

github-actions bot commented Apr 3, 2025

Run Gluten Clickhouse CI on x86

Contributor

@marin-ma marin-ma left a comment

Thanks!

@dcoliversun
Contributor Author

@marin-ma Thanks for your review, it's ready to merge 😄

@marin-ma marin-ma merged commit 0bc5e2f into apache:main Apr 3, 2025
52 checks passed
@GlutenPerfBot
Contributor

===== Performance report for TPCDS SF2000 with Velox backend, for reference only ====

| query | log/native_master_04_03_2025_time.csv | log/native_master_04_02_2025_d2b622212c_time.csv | difference | percentage |
|---|---|---|---|---|
| q1 | 11.02 | 11.70 | 0.684 | 106.21% |
| q2 | 11.38 | 11.88 | 0.501 | 104.40% |
| q3 | 2.47 | 2.98 | 0.520 | 121.08% |
| q4 | 51.32 | 50.35 | -0.969 | 98.11% |
| q5 | 8.54 | 10.24 | 1.695 | 119.84% |
| q6 | 4.85 | 5.24 | 0.389 | 108.03% |
| q7 | 3.12 | 7.25 | 4.129 | 232.45% |
| q8 | 4.50 | 5.28 | 0.787 | 117.51% |
| q9 | 16.33 | 15.23 | -1.099 | 93.27% |
| q10 | 13.75 | 12.60 | -1.147 | 91.66% |
| q11 | 26.61 | 29.17 | 2.564 | 109.64% |
| q12 | 2.92 | 1.47 | -1.448 | 50.35% |
| q13 | 6.87 | 5.29 | -1.586 | 76.93% |
| q14a | 43.07 | 43.59 | 0.516 | 101.20% |
| q14b | 39.67 | 40.39 | 0.723 | 101.82% |
| q15 | 3.67 | 3.31 | -0.354 | 90.34% |
| q16 | 4.26 | 5.46 | 1.196 | 128.05% |
| q17 | 6.30 | 7.36 | 1.055 | 116.73% |
| q18 | 7.63 | 8.10 | 0.471 | 106.18% |
| q19 | 5.80 | 4.80 | -0.994 | 82.84% |
| q20 | 2.61 | 1.54 | -1.071 | 58.95% |
| q21 | 0.84 | 0.79 | -0.047 | 94.39% |
| q22 | 4.30 | 3.36 | -0.933 | 78.28% |
| q23a | 62.45 | 61.52 | -0.933 | 98.51% |
| q23b | 72.29 | 74.00 | 1.716 | 102.37% |
| q24a | 69.78 | 71.55 | 1.769 | 102.53% |
| q24b | 70.03 | 67.96 | -2.064 | 97.05% |
| q25 | 7.12 | 4.94 | -2.179 | 69.38% |
| q26 | 2.89 | 3.00 | 0.115 | 103.98% |
| q27 | 2.77 | 2.59 | -0.180 | 93.51% |
| q28 | 17.30 | 17.05 | -0.247 | 98.57% |
| q29 | 8.37 | 8.27 | -0.098 | 98.82% |
| q30 | 5.04 | 6.02 | 0.979 | 119.44% |
| q31 | 8.30 | 8.63 | 0.336 | 104.05% |
| q32 | 1.79 | 1.89 | 0.096 | 105.35% |
| q33 | 4.35 | 4.12 | -0.226 | 94.80% |
| q34 | 5.31 | 3.42 | -1.890 | 64.40% |
| q35 | 7.53 | 7.63 | 0.100 | 101.33% |
| q36 | 2.83 | 2.30 | -0.533 | 81.19% |
| q37 | 3.35 | 3.36 | 0.016 | 100.47% |
| q38 | 12.21 | 12.10 | -0.118 | 99.04% |
| q39a | 4.93 | 4.12 | -0.810 | 83.55% |
| q39b | 3.63 | 3.57 | -0.054 | 98.51% |
| q40 | 3.90 | 3.79 | -0.110 | 97.19% |
| q41 | 0.89 | 0.65 | -0.235 | 73.50% |
| q42 | 1.64 | 0.83 | -0.813 | 50.49% |
| q43 | 2.21 | 1.71 | -0.498 | 77.49% |
| q44 | 6.02 | 6.52 | 0.507 | 108.42% |
| q45 | 3.61 | 3.07 | -0.544 | 84.93% |
| q46 | 4.02 | 4.90 | 0.885 | 122.03% |
| q47 | 9.65 | 9.63 | -0.025 | 99.74% |
| q48 | 3.70 | 3.51 | -0.191 | 94.84% |
| q49 | 5.53 | 5.83 | 0.301 | 105.45% |
| q50 | 17.43 | 17.32 | -0.114 | 99.35% |
| q51 | 7.70 | 8.26 | 0.564 | 107.32% |
| q52 | 0.95 | 0.69 | -0.254 | 73.18% |
| q53 | 1.56 | 1.56 | -0.004 | 99.71% |
| q54 | 6.03 | 5.92 | -0.106 | 98.24% |
| q55 | 1.29 | 0.89 | -0.401 | 68.94% |
| q56 | 3.50 | 4.09 | 0.593 | 116.96% |
| q57 | 7.89 | 6.67 | -1.225 | 84.49% |
| q58 | 3.66 | 3.06 | -0.601 | 83.59% |
| q59 | 4.60 | 4.25 | -0.352 | 92.35% |
| q60 | 4.60 | 4.45 | -0.155 | 96.63% |
| q61 | 4.30 | 4.20 | -0.105 | 97.57% |
| q62 | 2.78 | 3.12 | 0.338 | 112.18% |
| q63 | 1.54 | 1.52 | -0.016 | 98.93% |
| q64 | 35.85 | 36.37 | 0.511 | 101.43% |
| q65 | 11.46 | 11.43 | -0.024 | 99.79% |
| q66 | 3.64 | 3.78 | 0.142 | 103.91% |
| q67 | 58.18 | 57.44 | -0.739 | 98.73% |
| q68 | 2.82 | 3.25 | 0.429 | 115.19% |
| q69 | 5.58 | 4.86 | -0.718 | 87.13% |
| q70 | 5.37 | 5.71 | 0.338 | 106.30% |
| q71 | 4.69 | 4.70 | 0.014 | 100.30% |
| q72 | 23.38 | 20.74 | -2.639 | 88.71% |
| q73 | 2.70 | 2.81 | 0.112 | 104.14% |
| q74 | 17.46 | 17.57 | 0.110 | 100.63% |
| q75 | 22.98 | 22.86 | -0.119 | 99.48% |
| q76 | 7.13 | 7.05 | -0.073 | 98.97% |
| q77 | 2.85 | 3.02 | 0.172 | 106.02% |
| q78 | 33.51 | 33.11 | -0.408 | 98.78% |
| q79 | 4.18 | 3.61 | -0.574 | 86.28% |
| q80 | 11.33 | 10.98 | -0.346 | 96.94% |
| q81 | 6.99 | 6.32 | -0.669 | 90.43% |
| q82 | 6.39 | 5.58 | -0.816 | 87.24% |
| q83 | 1.89 | 1.83 | -0.059 | 96.85% |
| q84 | 2.76 | 3.01 | 0.256 | 109.27% |
| q85 | 5.53 | 5.93 | 0.405 | 107.33% |
| q86 | 1.99 | 2.14 | 0.153 | 107.72% |
| q87 | 12.24 | 11.86 | -0.383 | 96.87% |
| q88 | 15.05 | 16.10 | 1.050 | 106.97% |
| q89 | 3.51 | 2.09 | -1.422 | 59.49% |
| q90 | 1.93 | 2.06 | 0.132 | 106.87% |
| q91 | 4.35 | 3.29 | -1.055 | 75.73% |
| q92 | 2.26 | 1.78 | -0.489 | 78.41% |
| q93 | 23.16 | 23.95 | 0.791 | 103.41% |
| q94 | 9.70 | 8.49 | -1.207 | 87.55% |
| q9 | 57.41 | 56.21 | -1.200 | 97.91% |
| q5 | 2.64 | 2.56 | -0.086 | 96.75% |
| q96 | 11.12 | 10.77 | -0.355 | 96.81% |
| q97 | 4.55 | 2.46 | -2.096 | 53.98% |
| q98 | 5.71 | 5.21 | -0.502 | 91.22% |
| q99 | 0.52 | 0.54 | 0.015 | 102.85% |
| total | 1195.88 | 1183.35 | -12.533 | 98.95% |
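
The report does not define its last two columns. The rows are consistent with the difference being the second log column minus the first, and the percentage being the second divided by the first. Under that assumption, which is an inference from the numbers and not something the bot states, the arithmetic can be sketched as follows (`Delta` and `compare` are hypothetical names):

```cpp
#include <cassert>
#include <cmath>

// Result of comparing two timings, in the same units as the report.
struct Delta {
  double difference;  // secondRun - firstRun, in seconds
  double percentage;  // secondRun / firstRun, as a percentage
};

// Assumed reading of the report columns: the first argument is the
// 04_03 log value and the second is the 04_02 log value.
Delta compare(double firstRun, double secondRun) {
  return {secondRun - firstRun, secondRun / firstRun * 100.0};
}
```

Small mismatches against the published rows are expected, since the report presumably computes from unrounded timings while the table shows two decimals.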

@GlutenPerfBot
Contributor

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

| query | log/native_master_04_03_2025_time.csv | log/native_master_04_02_2025_d2b622212c_time.csv | difference | percentage |
|---|---|---|---|---|
| q1 | 25.61 | 24.57 | -1.035 | 95.96% |
| q2 | 25.60 | 25.92 | 0.322 | 101.26% |
| q3 | 33.25 | 33.45 | 0.203 | 100.61% |
| q4 | 28.60 | 28.30 | -0.298 | 98.96% |
| q5 | 60.48 | 60.91 | 0.429 | 100.71% |
| q6 | 8.26 | 9.99 | 1.722 | 120.84% |
| q7 | 39.74 | 39.71 | -0.029 | 99.93% |
| q8 | 64.75 | 64.57 | -0.178 | 99.73% |
| q9 | 90.36 | 100.05 | 9.693 | 110.73% |
| q10 | 37.91 | 40.48 | 2.561 | 106.76% |
| q11 | 15.93 | 16.60 | 0.671 | 104.21% |
| q12 | 17.24 | 16.11 | -1.130 | 93.45% |
| q13 | 24.60 | 24.51 | -0.093 | 99.62% |
| q14 | 10.89 | 12.29 | 1.395 | 112.80% |
| q15 | 25.67 | 26.53 | 0.853 | 103.32% |
| q16 | 12.17 | 11.81 | -0.364 | 97.01% |
| q17 | 73.14 | 74.17 | 1.035 | 101.42% |
| q18 | 111.25 | 114.45 | 3.199 | 102.88% |
| q19 | 15.73 | 22.34 | 6.611 | 142.04% |
| q20 | 27.01 | 23.35 | -3.659 | 86.46% |
| q21 | 172.88 | 175.60 | 2.717 | 101.57% |
| q22 | 14.00 | 10.32 | -3.675 | 73.74% |
| total | 935.09 | 956.04 | 20.950 | 102.24% |

@dcoliversun dcoliversun deleted the s3c branch April 4, 2025 02:48

Labels

CORE (works for Gluten Core), DOCS, VELOX

Development

Successfully merging this pull request may close these issues.

[VL] Support new s3 configuration in gluten

5 participants