
fix(ci): remove explicit AWS credential passing from docker builds#8310

Closed
sara4dev wants to merge 3 commits into main from fix/remove-aws-secret-from-docker-build

Conversation

Contributor

@sara4dev sara4dev commented Apr 17, 2026

Summary

  • Security fix: Remove explicit AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY passing through CI actions and Dockerfile secret mounts
  • BuildKit pods on aws-ci already have IAM access via IRSA (builder-v1-service-account), so sccache discovers credentials automatically through the AWS SDK credential chain
  • The previous approach set AWS credentials as step-level env vars; combined with set -x in the build script, this leaked the credentials into build logs via bash trace expansion
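
The trace-expansion leak described in the last bullet can be reproduced with a minimal sketch. `DEMO_SECRET` and both helper functions are hypothetical names used only for illustration; the value is a placeholder, not a real credential, and this is not the build script from the PR.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for a step-level secret env var (placeholder value).
export DEMO_SECRET="not-a-real-secret-value"

# Runs a command that expands the secret while tracing is on and returns
# whatever bash wrote to stderr (the trace output).
trace_with_xtrace() {
  ( set -x; : "using ${DEMO_SECRET}" ) 2>&1
}

# Same command with tracing off; nothing should reach the log.
trace_without_xtrace() {
  ( set +x; : "using ${DEMO_SECRET}" ) 2>&1
}

leaked="$(trace_with_xtrace)"
quiet="$(trace_without_xtrace)"

case "$leaked" in
  *"$DEMO_SECRET"*) echo "set -x: secret value appears in the trace output" ;;
esac
if [ -z "$quiet" ]; then
  echo "set +x: nothing written to the log"
fi
```

Bash prints each command after expansion when tracing is on, so any variable that holds a secret shows up in clear text as soon as a traced command references it.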

Changes

  • container/templates/wheel_builder.Dockerfile: Removed all 8 --mount=type=secret,id=aws-key-id / aws-secret-id from RUN steps
  • .github/actions/docker-remote-build/action.yml: Removed aws_access_key_id/aws_secret_access_key inputs, env vars, and SECRET_ARGS block
  • .github/actions/build-flavor/action.yml: Removed credential inputs and forwarding
  • .github/actions/docker-build/action.yml: Removed credential inputs and env vars
  • 5 workflow files: Removed secret declarations and forwarding

Test plan

  • CI build triggers and completes successfully (sccache picks up IRSA creds automatically)
  • Build logs contain no AKIA* patterns or AWS secret key values
  • sccache stats in build log show cache hits (confirms S3 connectivity via IRSA)
  • Verify builder-v1-service-account IRSA annotation exists on aws-ci cluster
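
One way the log check in the test plan could be scripted is sketched below. The function name and the sample log files are assumptions made for illustration. The pattern matches strings shaped like access key IDs (`AKIA` for long-term keys, `ASIA` for temporary STS keys); secret access keys have no fixed prefix, so they would have to be checked against known values separately.

```bash
#!/usr/bin/env bash
# Prints matching lines and returns 0 if the log contains a string shaped
# like an AWS access key ID; returns non-zero if none is found.
scan_log_for_aws_key_ids() {
  local log_file="$1"
  grep -nE '(AKIA|ASIA)[0-9A-Z]{16}' "$log_file"
}

# Made-up sample logs. AKIAIOSFODNN7EXAMPLE is the example key ID used in
# AWS documentation, not a real credential.
sample_dir="$(mktemp -d)"
printf '%s\n' '+ make build' '+ export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE' \
  > "${sample_dir}/leaky.log"
printf '%s\n' '+ make build' 'build finished' > "${sample_dir}/clean.log"

if scan_log_for_aws_key_ids "${sample_dir}/leaky.log" > /dev/null; then
  echo "leaky.log: key-shaped string found"
fi
if ! scan_log_for_aws_key_ids "${sample_dir}/clean.log" > /dev/null; then
  echo "clean.log: no key-shaped strings"
fi
```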

🤖 Generated with Claude Code



Summary by CodeRabbit

  • Chores
    • Removed explicit AWS access credentials from GitHub Actions definitions and build workflows for improved credential security management.
    • Eliminated direct AWS credential injection from build steps across multiple container and flavor build pipelines.
    • Streamlined Docker remote build configurations by removing credential mount arguments from build commands.

@sara4dev sara4dev requested review from a team as code owners April 17, 2026 18:08

copy-pr-bot Bot commented Apr 17, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

BuildKit pods on aws-ci already have IAM access via IRSA
(builder-v1-service-account), so sccache discovers credentials
automatically through the AWS SDK credential chain. Passing
AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY as step-level env vars
combined with set -x caused credentials to leak into build logs
via bash trace expansion.

Remove --mount=type=secret for AWS creds from all Dockerfile RUN
steps, remove aws_access_key_id/aws_secret_access_key inputs from
actions, and remove secret forwarding from all calling workflows.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@sara4dev sara4dev force-pushed the fix/remove-aws-secret-from-docker-build branch from 27a9f0d to 4d400ee Compare April 17, 2026 18:10
Contributor

coderabbitai Bot commented Apr 17, 2026

Walkthrough

This change systematically removes AWS credential inputs (aws_access_key_id, aws_secret_access_key) from GitHub Actions, workflows, and Dockerfile BuildKit secret mounts across the CI/CD pipeline. Credentials are no longer explicitly passed via action inputs, workflow secrets, or build secrets.

Changes

| Cohort / File(s) | Summary |
|---|---|
| GitHub Actions Input Removal: .github/actions/build-flavor/action.yml, .github/actions/docker-build/action.yml, .github/actions/docker-remote-build/action.yml | Removed aws_access_key_id and aws_secret_access_key input declarations; stopped forwarding AWS credentials to downstream build steps. In docker-remote-build, conditional SECRET_ARGS construction for BuildKit secret mounts was also deleted. |
| Workflow Secrets Removal: .github/workflows/build-flavor.yml, .github/workflows/build-frontend-image.yaml, .github/workflows/build-test-distribute-flavor-matrix.yml, .github/workflows/build-test-distribute-flavor.yml, .github/workflows/container-validation-dynamo.yml, .github/workflows/shared-build-image.yml | Removed AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY from workflow secrets definitions and removed corresponding with: parameters when invoking build actions. Other AWS inputs (region, account ID) remain unchanged. |
| Dockerfile BuildKit Secrets Removal: container/templates/wheel_builder.Dockerfile | Removed --secret aws-key-id --secret aws-secret-id mount arguments from multiple RUN steps (FFmpeg, UCX, libfabric, AWS SDK C++, ai-dynamo, nixl, kvbm builds); steps now omit AWS credential environment variables while retaining sccache and cache mounts. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

🚥 Pre-merge checks | ✅ 3 passed

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and accurately summarizes the main security-focused change: removing explicit AWS credential passing from Docker builds across CI actions and Dockerfiles. |
| Description check | ✅ Passed | The description covers the overview, detailed changes across multiple files, and a test plan, though it deviates from the template structure with additional context about IRSA and security implications. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


Contributor

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
container/templates/wheel_builder.Dockerfile (1)

570-599: LGTM — kvbm wheel build step.

Cache mounts retained, secret mounts dropped; auditwheel repair path unchanged.

One cross-file follow-up worth noting: container/use-sccache.sh (lines 97–99) still documents that S3 credentials are expected to be mounted via --mount=type=secret. Now that all Dockerfile RUN steps rely on IRSA/AWS SDK credential chain instead, consider updating that comment in a follow-up so the script's documentation matches the new CI model and future maintainers aren't sent looking for secret mounts that no longer exist.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@container/templates/wheel_builder.Dockerfile` around lines 570 - 599, The
comment in the use-sccache.sh script (use-sccache.sh around the documented block
at lines that mention S3 credential mounting) still instructs users to provide
S3 credentials via --mount=type=secret; update that comment to reflect the new
CI model by removing the secret-mount guidance and instead state that the script
relies on the AWS SDK/IRSA credential chain (environment, instance role, or
IRSA) for S3 access, and optionally note any fallback behavior or env var names
the script checks so future maintainers know where to look.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 4bf4ba12-82c0-4379-a7f8-9a1e5dd96632

📥 Commits

Reviewing files that changed from the base of the PR and between 5b03a59 and 27a9f0d.

📒 Files selected for processing (10)
  • .github/actions/build-flavor/action.yml
  • .github/actions/docker-build/action.yml
  • .github/actions/docker-remote-build/action.yml
  • .github/workflows/build-flavor.yml
  • .github/workflows/build-frontend-image.yaml
  • .github/workflows/build-test-distribute-flavor-matrix.yml
  • .github/workflows/build-test-distribute-flavor.yml
  • .github/workflows/container-validation-dynamo.yml
  • .github/workflows/shared-build-image.yml
  • container/templates/wheel_builder.Dockerfile
💤 Files with no reviewable changes (8)
  • .github/workflows/shared-build-image.yml
  • .github/workflows/build-flavor.yml
  • .github/workflows/build-test-distribute-flavor.yml
  • .github/workflows/build-frontend-image.yaml
  • .github/workflows/container-validation-dynamo.yml
  • .github/workflows/build-test-distribute-flavor-matrix.yml
  • .github/actions/docker-build/action.yml
  • .github/actions/build-flavor/action.yml

Contributor

@devin-ai-integration devin-ai-integration Bot left a comment


✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 3 additional findings.


Replace static AWS credential passing with dynamic temporary credentials
derived from the runner pod's IRSA role. Credentials are resolved via
`aws configure export-credentials`, written to temp files (never as env
vars), and passed to BuildKit via file-based --secret flags.

Changes:
- Dockerfile: restore --mount=type=secret on all sccache RUN steps,
  add aws-session-token for STS temp creds, rename aws-secret-id to
  aws-secret-key for clarity
- docker-remote-build: derive temp creds from IRSA with set +x guard,
  write to temp files, use --secret id=...,src=<file>
- No static AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY in env vars or
  GitHub secrets needed for docker builds

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
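
A rough sketch of the file-based `--secret` approach this commit describes follows. The secret ids are the ones named in the commit message; the function name, the placeholder values, and the overall layout are assumptions and not the actual contents of docker-remote-build/action.yml. The build command is printed rather than executed so the sketch runs without Docker.

```bash
#!/usr/bin/env bash
# Emits "--secret" and "id=<name>,src=<file>" (one argument per line) for
# each non-empty credential file in the given directory. Missing or empty
# files are skipped so that no empty secret is handed to the build.
build_secret_args() {
  local secret_dir="$1" id
  for id in aws-key-id aws-secret-key aws-session-token; do
    if [ -s "${secret_dir}/${id}" ]; then
      printf '%s\n' "--secret" "id=${id},src=${secret_dir}/${id}"
    fi
  done
}

# mktemp -d creates a directory that only the current user can access.
SECRET_DIR="$(mktemp -d)"
printf '%s' 'placeholder-key-id' > "${SECRET_DIR}/aws-key-id"
printf '%s' 'placeholder-secret' > "${SECRET_DIR}/aws-secret-key"
# aws-session-token is deliberately left absent to show that it is skipped.

SECRET_ARGS=()
while IFS= read -r arg; do
  SECRET_ARGS+=("$arg")
done < <(build_secret_args "$SECRET_DIR")

# Print instead of running the build.
echo docker buildx build "${SECRET_ARGS[@]}" .

# Remove the credential files once the build command has been issued.
rm -rf "$SECRET_DIR"
```

Because the values only ever live in files, no traced command references them, so `set -x` output shows file paths and never the credentials themselves.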
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment


Devin Review found 1 new potential issue.

View 5 additional findings in Devin Review.


Comment on lines +186 to 189
# Clean up credential files
if [ -n "${SECRET_DIR:-}" ]; then rm -rf "$SECRET_DIR"; fi

BUILD_EXIT_CODE=${PIPESTATUS[0]}
Contributor


🔴 PIPESTATUS overwritten by credential cleanup command, silently masking build failures

The credential cleanup command `if [ -n "${SECRET_DIR:-}" ]; then rm -rf "$SECRET_DIR"; fi` (line 187) is inserted between the `docker buildx build ... | tee` pipeline (line 184) and `BUILD_EXIT_CODE=${PIPESTATUS[0]}` (line 189). In bash, `PIPESTATUS` is reset after every command, so by the time line 189 executes, `PIPESTATUS[0]` reflects the exit status of the `if`/`rm` compound command (almost always 0), not the docker build. This means if the docker build fails, `BUILD_EXIT_CODE` will still be 0, `exit ${BUILD_EXIT_CODE}` will succeed, and the CI step will incorrectly report success.

The old code (before this PR) correctly placed BUILD_EXIT_CODE=${PIPESTATUS[0]} immediately after the pipeline with no intervening commands.

Suggested change
- # Clean up credential files
- if [ -n "${SECRET_DIR:-}" ]; then rm -rf "$SECRET_DIR"; fi
- BUILD_EXIT_CODE=${PIPESTATUS[0]}
+ BUILD_EXIT_CODE=${PIPESTATUS[0]}
+ # Clean up credential files
+ if [ -n "${SECRET_DIR:-}" ]; then rm -rf "$SECRET_DIR"; fi
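
The ordering problem can be shown in isolation with a small sketch. Here `false | tee /dev/null` stands in for a failing `docker buildx build ... | tee` pipeline, and the variable names `buggy_exit_code` and `fixed_exit_code` are made up for the illustration.

```bash
#!/usr/bin/env bash
# Buggy order: cleanup runs between the pipeline and the capture, so
# PIPESTATUS already describes the cleanup command, not the pipeline.
SECRET_DIR="$(mktemp -d)"
false | tee /dev/null
if [ -n "${SECRET_DIR:-}" ]; then rm -rf "$SECRET_DIR"; fi
buggy_exit_code=${PIPESTATUS[0]}

# Fixed order: capture the pipeline status first, then clean up.
SECRET_DIR="$(mktemp -d)"
false | tee /dev/null
fixed_exit_code=${PIPESTATUS[0]}
if [ -n "${SECRET_DIR:-}" ]; then rm -rf "$SECRET_DIR"; fi

echo "buggy=${buggy_exit_code} fixed=${fixed_exit_code}"
# prints "buggy=0 fixed=1"
```

The pipeline and the capture have to sit directly in the script body: wrapping the pipeline in a function would also replace `PIPESTATUS` with the status of the function call.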

aws configure export-credentials failed silently on runners, causing
empty credentials to be passed to sccache. Switch to explicit
aws sts assume-role-with-web-identity using the IRSA-injected
AWS_WEB_IDENTITY_TOKEN_FILE and AWS_ROLE_ARN env vars.

Also strip trailing newlines from credential files with end=''.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
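
The explicit token exchange described in this commit might look roughly like the sketch below. `fetch_irsa_credentials`, `write_secret_file`, and the session name `docker-build` are hypothetical, and `printf '%s'` is swapped in here for the `end=''` approach mentioned in the message. The fetch function is only defined, not called, because it needs the AWS CLI and a pod with IRSA configured.

```bash
#!/usr/bin/env bash
# Writes a value to a file with no trailing newline, readable by owner only.
write_secret_file() {
  local path="$1" value="$2"
  ( umask 077; printf '%s' "$value" > "$path" )
}

# Sketch only: exchanges the IRSA-injected web identity token for temporary
# credentials and writes them to files under the given directory.
fetch_irsa_credentials() {
  local secret_dir="$1" key_id="" secret_key="" session_token=""
  local restore_xtrace=0 status=0
  # Suspend tracing (if enabled) so no credential reaches the log.
  case "$-" in *x*) restore_xtrace=1; set +x ;; esac

  # --output text prints the three queried fields tab-separated on one line.
  read -r key_id secret_key session_token < <(
    aws sts assume-role-with-web-identity \
      --role-arn "$AWS_ROLE_ARN" \
      --role-session-name "docker-build" \
      --web-identity-token "$(cat "$AWS_WEB_IDENTITY_TOKEN_FILE")" \
      --query 'Credentials.[AccessKeyId,SecretAccessKey,SessionToken]' \
      --output text
  )

  if [ -n "$key_id" ] && [ -n "$secret_key" ] && [ -n "$session_token" ]; then
    write_secret_file "${secret_dir}/aws-key-id" "$key_id"
    write_secret_file "${secret_dir}/aws-secret-key" "$secret_key"
    write_secret_file "${secret_dir}/aws-session-token" "$session_token"
  else
    # Fail loudly instead of passing empty credentials to the build.
    echo "could not obtain temporary credentials" >&2
    status=1
  fi

  if [ "$restore_xtrace" -eq 1 ]; then set -x; fi
  return "$status"
}

# Demonstrate the newline handling with a placeholder value.
demo_dir="$(mktemp -d)"
write_secret_file "${demo_dir}/aws-key-id" "placeholder"
echo "placeholder" > "${demo_dir}/with-newline"
printf_bytes=$(( $(wc -c < "${demo_dir}/aws-key-id") ))
echo_bytes=$(( $(wc -c < "${demo_dir}/with-newline") ))
echo "printf wrote ${printf_bytes} bytes, echo wrote ${echo_bytes} bytes"
```

The emptiness check is the part that addresses the silent failure: if the exchange returns nothing, the function reports an error rather than writing empty credential files.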
@dmitry-tokarev-nv
Contributor

closing in favor of #8324

@dmitry-tokarev-nv dmitry-tokarev-nv deleted the fix/remove-aws-secret-from-docker-build branch April 17, 2026 22:03

2 participants