Skip to content

refactor(build): consolidate sccache AWS credential handling#8323

Closed
dmitry-tokarev-nv wants to merge 2 commits into
mainfrom
dtokarev/sccache-creds-refactor
Closed

refactor(build): consolidate sccache AWS credential handling#8323
dmitry-tokarev-nv wants to merge 2 commits into
mainfrom
dtokarev/sccache-creds-refactor

Conversation

@dmitry-tokarev-nv
Copy link
Copy Markdown
Contributor

@dmitry-tokarev-nv dmitry-tokarev-nv commented Apr 17, 2026

Summary

  • Consolidate the per-variable BuildKit secret mounts in wheel_builder.Dockerfile into a single shared-credentials-file mount.
  • Point the AWS SDK default provider chain at the mounted file via AWS_SHARED_CREDENTIALS_FILE so sccache resolves credentials transparently.
  • Compose the credentials file once in the CI composite action and emit a single --secret id=aws-credentials,src=... to buildx.

Test plan

  • CI wheel build completes on amd64/arm64
  • sccache reports cache activity during FFMPEG/UCX/libfabric/NIXL builds
  • Rendered Dockerfile parses for sglang/vllm/trtllm frameworks

🤖 Generated with Claude Code


Open with Devin

Summary by CodeRabbit

  • Chores
    • Enhanced container build process security by refactoring credential handling to use a file-based approach instead of environment variables, reducing credential exposure during builds.

Switch the wheel_builder BuildKit secret mounts from two per-variable
env= mounts to a single shared-credentials-file mount, and point the AWS
SDK default provider chain at it via AWS_SHARED_CREDENTIALS_FILE. Lets
sccache's aws-sdk-rust resolve credentials transparently without each
RUN plumbing them through the Dockerfile environment.

CI composes the credentials file once from the existing Actions secrets
and emits a single --secret id=aws-credentials,src=... to buildx. An EXIT
trap removes the host tempfile.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Dmitry Tokarev <dtokarev@nvidia.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 17, 2026

Walkthrough

AWS credential handling is refactored to use a single BuildKit-mounted credentials file instead of separate environment variable secrets. The GitHub Action creates a temporary credentials file (with restricted permissions) and manages its lifecycle with a trap handler. The Dockerfile template is updated across all build stages to mount this single credentials file and configure the AWS SDK to read from it.

Changes

Cohort / File(s) Summary
AWS Credentials Refactoring
.github/actions/docker-remote-build/action.yml, container/templates/wheel_builder.Dockerfile
Consolidated AWS credential management from separate aws-key-id and aws-secret-id secrets to a single aws-credentials file-based approach. Added temp file creation with restricted permissions and EXIT trap for cleanup in the action. Updated Dockerfile to set AWS_SHARED_CREDENTIALS_FILE and mount the credentials file consistently across FFmpeg, UCX, libfabric, AWS SDK C++, and Python wheel build stages.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely summarizes the main change: consolidating sccache AWS credential handling from multiple separate secrets into a single shared approach.
Description check ✅ Passed The description provides a clear summary of changes, test plan, and context. However, it lacks the 'Where should the reviewer start?' and 'Related Issues' sections specified in the template.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
.github/actions/docker-remote-build/action.yml (1)

173-186: Set the EXIT trap before creating the tempfile to close the cleanup gap.

Today the tempfile is created at line 175 and the cleanup trap is only installed at line 186. If the script is interrupted (e.g. SIGTERM during heredoc write or by a failing chmod/cat), the credentials file is left behind on the runner. Installing the trap first — guarded on AWS_CREDENTIALS_FILE being non-empty — closes that window without changing behavior on the happy path.

♻️ Proposed reorder
         set +x
         SECRET_ARGS=""
         AWS_CREDENTIALS_FILE=""
+        # Ensure the credentials tempfile is removed even if buildx (or this step) fails.
+        trap '[ -n "${AWS_CREDENTIALS_FILE:-}" ] && [ -f "${AWS_CREDENTIALS_FILE}" ] && rm -f "${AWS_CREDENTIALS_FILE}"' EXIT
         if [ "${{ inputs.use_sccache }}" == "true" ] && [ -n "${AWS_ACCESS_KEY_ID:-}" ]; then
           AWS_CREDENTIALS_FILE="$(mktemp)"
           chmod 600 "$AWS_CREDENTIALS_FILE"
           cat > "$AWS_CREDENTIALS_FILE" <<EOF
         [default]
         aws_access_key_id=${AWS_ACCESS_KEY_ID}
         aws_secret_access_key=${AWS_SECRET_ACCESS_KEY}
         EOF
           SECRET_ARGS+=" --secret id=aws-credentials,src=${AWS_CREDENTIALS_FILE}"
         fi
         set -x
-        # Ensure the credentials tempfile is removed even if buildx fails.
-        trap '[ -n "${AWS_CREDENTIALS_FILE:-}" ] && [ -f "${AWS_CREDENTIALS_FILE}" ] && rm -f "${AWS_CREDENTIALS_FILE}"' EXIT
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/actions/docker-remote-build/action.yml around lines 173 - 186, The
cleanup trap is installed after creating the tempfile, leaving a window where an
interruption can leak credentials; move the trap installation so it runs
immediately before creating AWS_CREDENTIALS_FILE (i.e., place the trap that
checks AWS_CREDENTIALS_FILE and removes the file on EXIT before the
mktemp/cat/heredoc sequence), keep the same guard logic ([ -n
"${AWS_CREDENTIALS_FILE:-}" ] && [ -f "${AWS_CREDENTIALS_FILE}" ] && rm -f
"${AWS_CREDENTIALS_FILE}") and leave the rest of the logic that sets
AWS_CREDENTIALS_FILE via mktemp, chmod, the heredoc and appending to SECRET_ARGS
unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/actions/docker-remote-build/action.yml:
- Around line 174-183: The credentials block only checks AWS_ACCESS_KEY_ID and
uses an unquoted heredoc, which can create a partial creds file or corrupt
values; update the if-condition to require both AWS_ACCESS_KEY_ID and
AWS_SECRET_ACCESS_KEY (e.g., test -n "${AWS_ACCESS_KEY_ID:-}" && -n
"${AWS_SECRET_ACCESS_KEY:-}"), and make the heredoc quoted (use <<'EOF') when
writing AWS_CREDENTIALS_FILE so shell metacharacters in the secret are not
expanded before appending to SECRET_ARGS; ensure references to
AWS_CREDENTIALS_FILE, SECRET_ARGS, mktemp, and the heredoc are preserved.

---

Nitpick comments:
In @.github/actions/docker-remote-build/action.yml:
- Around line 173-186: The cleanup trap is installed after creating the
tempfile, leaving a window where an interruption can leak credentials; move the
trap installation so it runs immediately before creating AWS_CREDENTIALS_FILE
(i.e., place the trap that checks AWS_CREDENTIALS_FILE and removes the file on
EXIT before the mktemp/cat/heredoc sequence), keep the same guard logic ([ -n
"${AWS_CREDENTIALS_FILE:-}" ] && [ -f "${AWS_CREDENTIALS_FILE}" ] && rm -f
"${AWS_CREDENTIALS_FILE}") and leave the rest of the logic that sets
AWS_CREDENTIALS_FILE via mktemp, chmod, the heredoc and appending to SECRET_ARGS
unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7bb136e6-6f48-4852-bb40-a0aa94173ebf

📥 Commits

Reviewing files that changed from the base of the PR and between 32ec044 and 946a016.

📒 Files selected for processing (2)
  • .github/actions/docker-remote-build/action.yml
  • container/templates/wheel_builder.Dockerfile

Comment thread .github/actions/docker-remote-build/action.yml
Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 4 additional findings.

Open in Devin Review

Replace the stage-level AWS_SHARED_CREDENTIALS_FILE ENV with per-RUN
inline exports so the path is only set in RUN blocks that actually mount
the credentials file. Keeps env scope tight to the BuildKit secret
lifetime and avoids a dangling ENV pointer in RUNs that do not request
the mount.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Dmitry Tokarev <dtokarev@nvidia.com>
@dmitry-tokarev-nv
Copy link
Copy Markdown
Contributor Author

closing in favor of #8324

@dmitry-tokarev-nv dmitry-tokarev-nv deleted the dtokarev/sccache-creds-refactor branch April 17, 2026 22:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants