importinto: require S3-like auth for nextgen import (#68231) by ti-chi-bot · Pull Request #68234 · pingcap/tidb

ti-chi-bot · 2026-05-08T11:37:27Z

This is an automated cherry-pick of #68231

What problem does this PR solve?

Issue Number: close #68226

Problem Summary:

In NextGen security enhanced mode, IMPORT INTO accepted S3-like storage URIs without explicit user-provided credentials. That allowed the object-store client to fall back to TiDB node-role credentials, which weakens the expected boundary for user-specified import sources.

What changed and how does it work?

This PR requires explicit authentication for S3-like IMPORT INTO sources when NextGen and SEM are enabled.

- Adds normalized object-store query parameter matching so both dash and underscore spellings are handled consistently.
- Defines shared S3-like query keys for access key, secret access key, and role ARN.
- Rejects S3-like import paths unless they provide either a non-empty access key/secret access key pair or a non-empty role ARN.
- Preserves the existing NextGen SEM behavior that rejects explicit external ID and injects the keyspace name as the external ID for allowed paths.
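
The rule described above can be sketched as a small standalone program. This is only an illustration under stated assumptions, not the code in this PR: the key spellings (`access-key`, `secret-access-key`, `role-arn`, `external-id`) are inferred from the test URIs in this thread, and `normalizeKey`, `checkExplicitAuth`, and `checkURI` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// Assumed normalized spellings of the S3-like query keys. The real constants
// live in pkg/objstore/s3like and may differ.
const (
	keyAccessKey       = "access-key"
	keySecretAccessKey = "secret-access-key"
	keyRoleARN         = "role-arn"
	keyExternalID      = "external-id"
)

// normalizeKey lowercases a query key and maps underscores to hyphens,
// so "ROLE_ARN" and "role-arn" compare equal.
func normalizeKey(k string) string {
	return strings.ReplaceAll(strings.ToLower(k), "_", "-")
}

// checkExplicitAuth (hypothetical) rejects an explicit external ID and
// requires either a non-empty AK/SK pair or a non-empty role ARN.
func checkExplicitAuth(u *url.URL) error {
	values := u.Query()
	var hasAK, hasSK, hasRole bool
	for k := range values {
		switch normalizeKey(k) {
		case keyExternalID:
			return errors.New("explicit external ID is not supported")
		case keyAccessKey:
			hasAK = hasAK || values.Get(k) != ""
		case keySecretAccessKey:
			hasSK = hasSK || values.Get(k) != ""
		case keyRoleARN:
			hasRole = hasRole || values.Get(k) != ""
		}
	}
	if !hasRole && !(hasAK && hasSK) {
		return errors.New("S3-like import requires access key/secret access key or role ARN")
	}
	return nil
}

// checkURI parses a raw URI and applies checkExplicitAuth to it.
func checkURI(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return err
	}
	return checkExplicitAuth(u)
}

func main() {
	for _, raw := range []string{
		"s3://bucket/data.csv",
		"s3://bucket/data.csv?access_key=ak&secret_access_key=sk",
		"oss://bucket/data.csv?role-arn=arn",
		"s3://bucket/data.csv?role-arn=arn&EXTERNAL_ID=x",
	} {
		fmt.Printf("%-58s -> %v\n", raw, checkURI(raw))
	}
}
```

In this sketch the check looks only at query parameters, so the same rule applies to every S3-like scheme. Whether the real gate also distinguishes schemes is not shown in this thread.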

Check List

Tests

Unit test
Integration test
Manual test (add detailed scripts or steps below)
No need to test
- I checked and no code files have been changed.

Unit tests:

./tools/check/failpoint-go-test.sh pkg/planner/core -tags=intest,deadlock,nextgen -run TestProcessNextGenS3Path -count=1
./tools/check/failpoint-go-test.sh pkg/executor -tags=intest,deadlock,nextgen -run TestNextGenS3ExternalID -count=1
make lint

Side effects

Performance regression: Consumes more CPU
Performance regression: Consumes more Memory
Breaking backward compatibility

Documentation

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

In NextGen security enhanced mode, IMPORT INTO from S3-like storage now requires access key/secret access key credentials or a role ARN.

Summary by CodeRabbit

Bug Fixes
- Enhanced validation for IMPORT INTO operations on S3-like cloud storage to enforce explicit authentication requirements. Statements without valid access credentials (access key/secret key) or role ARN are now rejected in next-gen kernel mode.

coderabbitai · 2026-05-08T11:37:48Z

Walkthrough

This PR implements authentication enforcement for IMPORT INTO on S3-like storage in NextGen clusters. It adds S3 credential constants, normalizes query parameter parsing, rewrites the S3 validation logic to require either access-key/secret-access-key or role-arn, and updates tests across executor, planner, and SEM integration layers.

Changes

NextGen S3 Authentication Requirements

| Layer / File(s) | Summary |
|---|---|
| S3 Credential Constants `pkg/objstore/s3like/store.go` | Exports `S3AccessKey`, `S3SecretAccessKey`, `S3RoleARN` constants for credential parameter keys. |
| Query Parameter Normalization `pkg/objstore/parse.go` | Adds `NormalizeQueryParameterKey` helper to lowercase and convert underscores to hyphens; `ExtractQueryParameters` uses it instead of inline normalization. |
| NextGen S3 Validation `pkg/planner/core/planbuilder.go` | `checkNextGenS3PathWithSem` now parses normalized S3 parameters, rejects explicit `external_id`, and requires either both `access_key` and `secret_access_key` or `role_arn`; `buildImportInto` invokes validation unconditionally for NextGen SEM S3-like imports. |
| Planner Unit Tests `pkg/planner/core/planbuilder_test.go` | `TestProcessNextGenS3Path` adds unsupported cases for `external_id` variants, supported cases for credential parameters (including underscore aliases), and error cases for missing credentials. |
| Executor Integration Tests `pkg/executor/import_into_test.go` | Adds `TestNextGenS3ExternalID` asserting SEM rejection of credentials-less S3-like URIs; modifies "local sort" and "unsupported options" test URIs to include `access-key`/`secret-access-key`. |
| SEM Conditional Integration Tests `pkg/util/sem/compat/sem_integration_test.go` | `TestRestrictedSQL` branches on `kerneltype.IsNextGen()`: NextGen rejects explicit `EXTERNAL-ID`, legacy mode preserves failpoint verification; adds import and Bazel dependency for `kerneltype`. |
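
The normalization row above describes lowercasing keys and converting underscores to hyphens. A minimal sketch of that idea follows. `normalizeQueryKey` and `normalizedQuery` are hypothetical names, and the real `NormalizeQueryParameterKey` and `ExtractQueryParameters` may behave differently in details not shown in this thread.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// normalizeQueryKey lowercases the key and replaces underscores with hyphens,
// following the behavior described in the walkthrough table.
func normalizeQueryKey(k string) string {
	return strings.ReplaceAll(strings.ToLower(k), "_", "-")
}

// normalizedQuery returns the first value of each query parameter, indexed by
// its normalized key. If two spellings of the same key appear in one URI,
// which one wins is unspecified in this sketch (map iteration order).
func normalizedQuery(raw string) (map[string]string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return nil, err
	}
	out := make(map[string]string)
	for k, vs := range u.Query() {
		if len(vs) == 0 {
			continue
		}
		out[normalizeQueryKey(k)] = vs[0]
	}
	return out, nil
}

func main() {
	m, err := normalizedQuery("s3://bucket?ACCESS_KEY=ak&Secret_Access_Key=sk")
	if err != nil {
		panic(err)
	}
	fmt.Println(m["access-key"], m["secret-access-key"])
}
```

Matching on the normalized key lets a single `switch` cover both the dash and underscore spellings that the planner tests exercise.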

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

pingcap/tidb#68231: Makes parallel changes to normalize S3 query keys, add credential constants, and enforce NextGen SEM authentication requirements for IMPORT INTO.

Suggested labels

component/import, lgtm

Suggested reviewers

GMHDBJD
joechenrh
hawkingrei

Poem

🐰 S3 credentials now required with care,
No node-role fallback—authenticate with flair!
NextGen enforces: AK/SK or role ARN must be there,
Query parameters normalized, S3 is finally fair. ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly summarizes the main change: requiring S3-like authentication for NextGen imports, which is the primary objective of this PR. |
| Description check | ✅ Passed | The description covers all required sections: issue number, problem summary, what changed, test coverage, breaking compatibility, and release notes. Content is complete and addresses the linked issue. |
| Linked Issues check | ✅ Passed | The PR implements all key requirements from `#68226`: normalized query parameter handling, S3-like auth key constants, enforcement of AK/SK or role ARN, rejection of credentials-less URIs, and preservation of external-ID handling. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to enforcing S3-like authentication for NextGen IMPORT INTO. No unrelated modifications to other systems or functionality were introduced. |

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/planner/core/planbuilder_test.go`:
- Around line 1147-1157: The test is missing S3 role-ARN cases: update the
supported-cases loop in planbuilder_test.go to include
"s3://bucket?role-arn=arn" and "s3://bucket?role_arn=arn" alongside the existing
S3 AK/SK and OSS role-ARN entries so checkNextGenS3PathWithSem is exercised for
S3 role ARN authentication; locate the loop that parses URLs and calls
checkNextGenS3PathWithSem and add those two S3 strings to the slice.

In `@pkg/planner/core/planbuilder.go`:
- Around line 6323-6341: The code treats whitespace-only auth parameters as
present by checking values.Get(k) != ""; update the checks inside the loop that
set hasAccessKey, hasSecretAccessKey, and hasRoleARN to trim whitespace before
testing non-empty (e.g., use strings.TrimSpace(values.Get(k)) != "") so that
only non-blank values count as provided; keep using
objstore.NormalizeQueryParameterKey(k) and the same s3like constants
(s3like.S3AccessKey, s3like.S3SecretAccessKey, s3like.S3RoleARN) and leave the
final validation logic (the if !hasRoleARN && !(hasAccessKey &&
hasSecretAccessKey) ...) unchanged.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: af02fd39-e950-4a75-a0f1-3be9aa57900b

📥 Commits

Reviewing files that changed from the base of the PR and between 678fb23 and de7515c.

📒 Files selected for processing (7)

pkg/executor/import_into_test.go
pkg/objstore/parse.go
pkg/objstore/s3like/store.go
pkg/planner/core/planbuilder.go
pkg/planner/core/planbuilder_test.go
pkg/util/sem/compat/BUILD.bazel
pkg/util/sem/compat/sem_integration_test.go

coderabbitai · 2026-05-08T11:54:47Z

+	for _, str := range []string{
+		"s3://bucket?access-key=ak&secret-access-key=sk",
+		"s3://bucket?access_key=ak&secret_access_key=sk",
+		"oss://bucket?role-arn=arn",
+		"oss://bucket?role_arn=arn",
+	} {
+		u, err := url.Parse(str)
+		require.NoError(t, err)
+		err = checkNextGenS3PathWithSem(u)
+		require.NoError(t, err)
+	}


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find the implementation of checkNextGenS3PathWithSem to see whether role-arn
# is checked per-scheme or scheme-agnostically.
ast-grep --pattern 'func checkNextGenS3PathWithSem($$$) { $$$ }'

Repository: pingcap/tidb

Length of output: 38

🏁 Script executed:

#!/bin/bash
# Search for the function definition using rg
rg "func checkNextGenS3PathWithSem" -A 30

Repository: pingcap/tidb

Length of output: 2200

🏁 Script executed:

#!/bin/bash
# Also search for where this function is defined/called
rg "checkNextGenS3PathWithSem" -B 2 -A 10

Repository: pingcap/tidb

Length of output: 4638

Add missing test cases for S3 with role ARN authentication.

The implementation of `checkNextGenS3PathWithSem` checks query parameters scheme-agnostically. Because it accepts a role ARN as valid authentication (it returns an error only when `!hasRoleARN && !(hasAccessKey && hasSecretAccessKey)`), the "supported" test block should also include `"s3://bucket?role-arn=arn"` and `"s3://bucket?role_arn=arn"` alongside the existing S3 AK/SK cases and OSS role ARN cases.

coderabbitai · 2026-05-08T11:54:47Z

+	hasAccessKey := false
+	hasSecretAccessKey := false
+	hasRoleARN := false
 	for k := range values {
-		lowerK := strings.ToLower(k)
-		if lowerK == s3like.S3ExternalID {
+		normalizedK := objstore.NormalizeQueryParameterKey(k)
+		switch normalizedK {
+		case s3like.S3ExternalID:
 			return plannererrors.ErrNotSupportedWithSem.GenWithStackByArgs("IMPORT INTO with explicit external ID")
+		case s3like.S3AccessKey:
+			hasAccessKey = hasAccessKey || values.Get(k) != ""
+		case s3like.S3SecretAccessKey:
+			hasSecretAccessKey = hasSecretAccessKey || values.Get(k) != ""
+		case s3like.S3RoleARN:
+			hasRoleARN = hasRoleARN || values.Get(k) != ""
 		}
 	}

+	if !hasRoleARN && !(hasAccessKey && hasSecretAccessKey) {
+		return plannererrors.ErrNotSupportedWithSem.GenWithStackByArgs("IMPORT INTO from S3-like storage without access key/secret access key or role ARN")


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Trim auth values before treating them as present.

values.Get(k) != "" accepts whitespace-only access-key, secret-access-key, and role-arn, so a URL like ...?role-arn=%20 currently passes this SEM gate even though the new contract requires non-empty explicit auth.
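
The whitespace case described above can be reproduced with the standard library alone. The sketch below is illustrative only: `firstValue`, `nonEmpty`, and `nonBlank` are hypothetical helpers, not functions from the TiDB code base. It relies on `url.ParseQuery` decoding `%20` to a single space.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// firstValue decodes a raw query string and returns the first value for key,
// or "" if the query cannot be parsed or the key is absent.
func firstValue(rawQuery, key string) string {
	values, err := url.ParseQuery(rawQuery)
	if err != nil {
		return ""
	}
	return values.Get(key)
}

// nonEmpty mirrors the check under review: any non-empty string counts.
func nonEmpty(v string) bool { return v != "" }

// nonBlank mirrors the suggested check: whitespace-only strings do not count.
func nonBlank(v string) bool { return strings.TrimSpace(v) != "" }

func main() {
	for _, q := range []string{"role-arn=arn", "role-arn=", "role-arn=%20", "role-arn=%09"} {
		v := firstValue(q, "role-arn")
		fmt.Printf("%-14s value=%q nonEmpty=%t nonBlank=%t\n", q, v, nonEmpty(v), nonBlank(v))
	}
}
```

The two predicates differ only for values made up entirely of whitespace, which is the case this comment is about.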

Suggested fix

 	for k := range values {
 		normalizedK := objstore.NormalizeQueryParameterKey(k)
 		switch normalizedK {
 		case s3like.S3ExternalID:
 			return plannererrors.ErrNotSupportedWithSem.GenWithStackByArgs("IMPORT INTO with explicit external ID")
 		case s3like.S3AccessKey:
-			hasAccessKey = hasAccessKey || values.Get(k) != ""
+			hasAccessKey = hasAccessKey || strings.TrimSpace(values.Get(k)) != ""
 		case s3like.S3SecretAccessKey:
-			hasSecretAccessKey = hasSecretAccessKey || values.Get(k) != ""
+			hasSecretAccessKey = hasSecretAccessKey || strings.TrimSpace(values.Get(k)) != ""
 		case s3like.S3RoleARN:
-			hasRoleARN = hasRoleARN || values.Get(k) != ""
+			hasRoleARN = hasRoleARN || strings.TrimSpace(values.Get(k)) != ""
 		}
 	}

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change (replacement for the highlighted hunk):

	hasAccessKey := false
	hasSecretAccessKey := false
	hasRoleARN := false
	for k := range values {
		normalizedK := objstore.NormalizeQueryParameterKey(k)
		switch normalizedK {
		case s3like.S3ExternalID:
			return plannererrors.ErrNotSupportedWithSem.GenWithStackByArgs("IMPORT INTO with explicit external ID")
		case s3like.S3AccessKey:
			hasAccessKey = hasAccessKey || strings.TrimSpace(values.Get(k)) != ""
		case s3like.S3SecretAccessKey:
			hasSecretAccessKey = hasSecretAccessKey || strings.TrimSpace(values.Get(k)) != ""
		case s3like.S3RoleARN:
			hasRoleARN = hasRoleARN || strings.TrimSpace(values.Get(k)) != ""
		}
	}

	if !hasRoleARN && !(hasAccessKey && hasSecretAccessKey) {
		return plannererrors.ErrNotSupportedWithSem.GenWithStackByArgs("IMPORT INTO from S3-like storage without access key/secret access key or role ARN")

codecov · 2026-05-08T12:44:24Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (release-nextgen-202603@678fb23). Learn more about missing BASE report.

Additional details and impacted files

@@                     Coverage Diff                     @@
##             release-nextgen-202603     #68234   +/-   ##
===========================================================
  Coverage                          ?   77.5687%           
===========================================================
  Files                             ?       1962           
  Lines                             ?     544099           
  Branches                          ?          0           
===========================================================
  Hits                              ?     422051           
  Misses                            ?     121196           
  Partials                          ?        852

| Flag | Coverage Δ |
|---|---|
| unit | `76.1749% <100.0000%> (?)` |

Flags with carried forward coverage won't be shown.

| Components | Coverage Δ |
|---|---|
| dumpling | `61.5065% <0.0000%> (?)` |
| parser | `∅ <0.0000%> (?)` |
| br | `60.9801% <0.0000%> (?)` |

D3Hunter · 2026-05-08T13:07:40Z

/retest

ti-chi-bot · 2026-05-09T02:03:57Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: D3Hunter, hawkingrei

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [D3Hunter,hawkingrei]
~~pkg/objstore/OWNERS~~ [D3Hunter]
~~pkg/planner/OWNERS~~ [hawkingrei]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

ti-chi-bot · 2026-05-09T02:04:02Z

[LGTM Timeline notifier]

Timeline:

2026-05-08 12:32:21.101789191 +0000 UTC m=+443813.975139153: ☑️ agreed by D3Hunter.
2026-05-09 02:04:01.272111903 +0000 UTC m=+492514.145461885: ☑️ agreed by hawkingrei.

D3Hunter added 3 commits May 8, 2026 11:37

- 4e11325 change
- 44d9171 planner: simplify S3 auth query checks
- de7515c util/sem: expect external ID import error in nextgen

ti-chi-bot added the release-note, sig/planner, size/L, and type/cherry-pick-for-release-nextgen-202603 labels May 8, 2026

ti-chi-bot mentioned this pull request May 8, 2026

importinto: require S3-like auth for nextgen import #68231

Merged

13 tasks

ti-chi-bot assigned D3Hunter May 8, 2026

coderabbitai Bot reviewed May 8, 2026

D3Hunter approved these changes May 8, 2026

ti-chi-bot Bot added the needs-1-more-lgtm label May 8, 2026

hawkingrei approved these changes May 9, 2026

ti-chi-bot Bot added the approved and lgtm labels and removed the needs-1-more-lgtm label May 9, 2026

ti-chi-bot Bot merged commit 77ac297 into pingcap:release-nextgen-202603 May 9, 2026
18 checks passed

ti-chi-bot Bot deleted the cherry-pick-68231-to-release-nextgen-202603 branch May 9, 2026 02:08

coderabbitai Bot mentioned this pull request May 12, 2026

vars: validate some vars for tidb x #68196

Merged

13 tasks

coderabbitai Bot mentioned this pull request May 20, 2026

Revert "importinto: require S3-like auth for nextgen import (#68231) (#68233)" #68517

Merged

5 tasks
