fix: use 0o666 mode for ~/.claude.json to fix permissions #852
Container root writes to this file, changing ownership. Using 0o666 ensures host user can still read it after modifications. Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>
Pull request overview
This pull request attempts to fix permission issues with ~/.claude.json that were causing EACCES: permission denied errors in GitHub Actions workflows. The file is created by the host, modified by the container running as root, and then read back by the host user for debugging purposes.
Changes:
- Changed file creation mode from 0o600 to 0o666 for ~/.claude.json
- Added inline comments explaining the permission requirement
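One detail worth checking against the change above: in Node.js the mode option of fs.writeFileSync is only used when the file is newly created, and the process umask is applied to it, so a requested 0o666 commonly lands as 0o644 under a 022 umask. The following is a minimal sketch, not the PR's code: writeWithExactMode and permissionBits are hypothetical helper names, and a temporary directory stands in for the real home directory. It follows the write with an explicit chmod so the final bits do not depend on the umask.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper (not from the PR): create a file and force its mode.
// The `mode` option of writeFileSync is masked by the process umask and is
// ignored when the file already exists, so chmodSync sets the final bits.
function writeWithExactMode(filePath: string, contents: string, mode: number): void {
  fs.writeFileSync(filePath, contents, { mode });
  fs.chmodSync(filePath, mode);
}

// Return only the permission bits (for example 0o666) of a file.
function permissionBits(filePath: string): number {
  return fs.statSync(filePath).mode & 0o777;
}

// A temp directory stands in for the host home directory in this sketch.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "claude-json-"));
const demoPath = path.join(demoDir, ".claude.json");
writeWithExactMode(demoPath, "{}", 0o666);
console.log(permissionBits(demoPath).toString(8));
```

Whether the explicit chmod is needed in practice depends on the umask of the process that creates the file, which the PR does not state.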
// Use 0o666 mode to allow container root to write and host user to read
// The entrypoint script runs as root and modifies this file
fs.writeFileSync(claudeJsonPath, '{}', { mode: 0o666 });
The entrypoint script sets chmod 600 on this file after writing to it (containers/agent/entrypoint.sh:153 and :160). This will make the file owned by root with 600 permissions, which will prevent the host user from reading it.
To fix this, the entrypoint script should either:
- Set chmod 644 instead of chmod 600 to allow read access
- Or use chmod a+r to add read permissions for all users
The current approach of setting 0o666 here won't work because the entrypoint script's chmod command will override these permissions.
- // Use 0o666 mode to allow container root to write and host user to read
- // The entrypoint script runs as root and modifies this file
- fs.writeFileSync(claudeJsonPath, '{}', { mode: 0o666 });
+ // The entrypoint script runs as root, writes to this file, and sets restrictive permissions
+ // Use 0o600 here to align with the entrypoint's chmod behavior
+ fs.writeFileSync(claudeJsonPath, '{}', { mode: 0o600 });
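The override described in the review comment can be reproduced in isolation. This is a rough sketch under stated assumptions: a temporary file stands in for ~/.claude.json, and the second write plus chmodSync call stand in for what the comment says entrypoint.sh does (the real script is shell and is not shown on this page).

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Stand-in for the host side: create the file with a permissive mode.
const workDir = fs.mkdtempSync(path.join(os.tmpdir(), "chmod-override-"));
const configPath = path.join(workDir, ".claude.json");
fs.writeFileSync(configPath, "{}", { mode: 0o666 });
const modeAfterCreate = fs.statSync(configPath).mode & 0o777;

// Stand-in for the entrypoint: rewrite the file, then apply chmod 600.
fs.writeFileSync(configPath, '{"apiKeyHelper":"/usr/local/bin/get-claude-key.sh"}');
fs.chmodSync(configPath, 0o600);
const modeAfterEntrypoint = fs.statSync(configPath).mode & 0o777;

// The later chmod wins regardless of the mode requested at creation time.
console.log(modeAfterCreate.toString(8), modeAfterEntrypoint.toString(8));
```

The creation mode only matters until something else changes the permissions, which is why the comment points at the entrypoint rather than at docker-manager.ts.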
* Initial plan

* docs: create comprehensive authentication architecture guide

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

* docs: link to authentication architecture guide

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

* docs: clarify that Codex/OpenAI uses same credential isolation as Claude

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

* fix: exclude OpenAI/Codex keys from agent when api-proxy enabled

When --env-all is used (as in smoke-codex test), OPENAI_API_KEY and CODEX_API_KEY were being passed to the agent container, bypassing the credential isolation provided by the api-proxy sidecar.

Changes:
- Add OPENAI_API_KEY, OPENAI_KEY, CODEX_API_KEY to EXCLUDED_ENV_VARS when enableApiProxy is true
- Add similar exclusion logic for non-envAll case (selective env passing)
- API keys remain correctly passed to api-proxy container for credential injection
- Add 4 new tests verifying keys are excluded from agent with/without envAll

This ensures both OpenAI/Codex and Anthropic/Claude credentials are properly isolated in the api-proxy sidecar, matching the documented architecture.

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

* fix: enable api-proxy for smoke-codex workflow

Removes CODEX_API_KEY and OPENAI_API_KEY from agent environment by:
- Adding --enable-api-proxy flag to AWF command
- Removing API keys from env block in smoke-codex.lock.yml

This ensures Codex uses the same credential isolation architecture as Claude, where API keys are held exclusively in the api-proxy sidecar container and never exposed to the agent execution environment.

With this change:
- Agent receives OPENAI_BASE_URL pointing to api-proxy:10000
- API keys passed to api-proxy container which injects auth headers
- One-shot-token library no longer detects keys in agent environment

Resolves feedback from comment 3901985142.

* feat: add api-proxy health checks for credential isolation

Adds pre-flight connectivity and credential isolation checks before agent execution.

Health check verifies:
- API keys NOT present in agent environment (ANTHROPIC_API_KEY, CLAUDE_API_KEY, OPENAI_API_KEY, CODEX_API_KEY, OPENAI_KEY)
- API proxy reachable via ANTHROPIC_BASE_URL and OPENAI_BASE_URL
- TCP connectivity test with 5 second timeout

Implementation:
- Created api-proxy-health-check.sh with detailed credential checks
- Integrated into agent entrypoint.sh (runs after iptables, before agent)
- Added to Dockerfile build process
- Fails fast if credential isolation broken or proxy unreachable

This ensures both Claude and Codex agents cannot access API keys directly, confirming the credential isolation architecture is working as designed.

* fix: restore API keys to workflow env for AWF CLI

The API keys must be present in the workflow step's env block so the AWF CLI process (running on the host) can read them via process.env.OPENAI_API_KEY.

The AWF CLI needs these environment variables to:
1. Detect that api-proxy should be enabled (checks for openaiApiKey/anthropicApiKey)
2. Pass API keys to the api-proxy container
3. Set OPENAI_BASE_URL and ANTHROPIC_BASE_URL for the agent container

The credential isolation still works because:
- API keys are in the EXCLUDED_ENV_VARS set when enableApiProxy is true
- Keys are excluded from the agent container environment (commit abedf83)
- Keys only go to the api-proxy sidecar container
- Agent receives only BASE_URL environment variables

This fixes the issue where OPENAI_BASE_URL was not being set because config.openaiApiKey was undefined (AWF CLI couldn't read the env var).

* feat: add claude code api key helper for credential isolation

This commit implements dynamic API key retrieval for Claude Code using a helper script following the LLM Gateway pattern, ensuring credential isolation where only the api-proxy container has access to real tokens.

Changes:
- Created /containers/agent/get-claude-key.sh: Helper script that outputs a placeholder API key (sk-ant-placeholder-key-for-credential-isolation)
- Updated containers/agent/Dockerfile: Added get-claude-key.sh to container image and made it executable
- Modified .github/workflows/smoke-claude.lock.yml: Configured Claude Code to use the apiKeyHelper by creating ~/.claude/config.json and unsetting ANTHROPIC_API_KEY/ANTHROPIC_AUTH_TOKEN environment variables before running the claude command

The api-proxy intercepts requests and injects the real ANTHROPIC_API_KEY, so the placeholder key never reaches the actual Anthropic API.

This ensures:
1. Claude Code agent never has access to the real API key
2. Only api-proxy container holds the real credentials
3. Health checks verify keys are NOT in agent environment

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* feat: add logging to api key helper and api-proxy for observability

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

* fix: add CLAUDE_CODE_API_KEY_HELPER env var for helper detection

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

* fix: unset HTTP_PROXY for claude to use api-proxy as gateway

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

* feat(logs): add api-proxy log persistence and preservation

- Add volume mount for api-proxy logs in docker-compose
- Redirect api-proxy stdout/stderr to /var/log/api-proxy/api-proxy.log
- Create api-proxy logs directory during writeConfigs()
- Preserve api-proxy logs after cleanup (similar to squid logs)
- Support both proxyLogsDir (workflow mode) and default mode
  - When proxyLogsDir is set, write logs to sibling directory
  - When proxyLogsDir is not set, move logs to /tmp/api-proxy-logs-<timestamp>
- Fix permissions on preserved logs for GitHub Actions artifact upload

This ensures api-proxy logs are accessible after smoke-claude and smoke-codex workflows finish, as requested in comment 3902135209.

* fix: add claude code api key helper validation and ttl config

- Add validation in entrypoint.sh to verify apiKeyHelper is in config file
- Check config file exists at ~/.claude/config.json
- Verify apiKeyHelper field matches CLAUDE_CODE_API_KEY_HELPER env var
- Exit with error if validation fails to prevent using wrong credentials
- Add CLAUDE_CODE_API_KEY_HELPER_TTL_MS=3600000 to smoke-claude workflow
- Ensures Claude Code properly detects and uses the API key helper script

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

* fix: make claude code api key helper validation conditional

The validation was failing in smoke-claude workflow because:
- Entrypoint validation runs before the user command executes
- The config file (~/.claude/config.json) is created by the user command
- Previous implementation required config file to exist, causing exit 1

Changes:
- Made validation conditional: only validate if config file exists
- If config file doesn't exist, log informative message and continue
- This allows user commands to create the config file after entrypoint runs
- Validation still occurs when config file exists (e.g., mounted from host)

Fixes comment 3902171070

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* feat: add post-step to display final claude code config

Added diagnostic step to cat both ~/.claude/config.json and ~/.claude.json after Claude Code execution completes. This helps verify that config updates preserve previous values and shows the final state for debugging.

The step runs with if: always() to ensure config is shown even if tests fail.

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

* fix: set CLAUDE_CODE_API_KEY_HELPER env var for credential isolation (#851)

* Initial plan

* fix: set CLAUDE_CODE_API_KEY_HELPER env var for credential isolation

When api-proxy is enabled with an Anthropic key, set the CLAUDE_CODE_API_KEY_HELPER environment variable to point to the get-claude-key.sh script. This ensures Claude Code CLI properly uses the API key helper for credential isolation.

Previously, only ANTHROPIC_BASE_URL was set, but Claude Code requires either a config file with apiKeyHelper or the environment variable to actually use the helper script. Without this, Claude Code would not read the config and authentication would fail.

This fix:
- Sets CLAUDE_CODE_API_KEY_HELPER=/usr/local/bin/get-claude-key.sh when api-proxy is enabled with Anthropic key
- Adds comprehensive tests for the new environment variable
- Updates type documentation to reflect the new env var

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

---------

Co-authored-by: anthropic-code-agent[bot] <242468646+Claude@users.noreply.github.com>
Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

* fix: change claude code config path to ~/.claude.json

Changed the config file path from ~/.claude/config.json to ~/.claude.json in both the entrypoint validation and the smoke-claude post-step diagnostic. This aligns with where Claude Code actually writes its config file (as shown in the debug logs).

The post-step now checks ~/.claude.json first and shows ~/.claude/config.json as legacy for backwards compatibility.

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

* fix: create ~/.claude.json with apiKeyHelper before claude starts

Changed entrypoint.sh to CREATE the ~/.claude.json config file with apiKeyHelper configuration if it doesn't exist, rather than just skipping validation.

This fixes authentication errors where Claude Code couldn't find the apiKeyHelper because the config file didn't exist yet when Claude Code needed to authenticate. The file is now created during container initialization, before the user command runs.

The file is created with mode 600 for security and contains:
{"apiKeyHelper":"/usr/local/bin/get-claude-key.sh"}

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

* fix: mount ~/.claude.json for chroot mode accessibility

Fixed authentication error in chroot mode where Claude Code couldn't find ~/.claude.json because it was created in the container filesystem instead of the mounted /host path.

Changes:
- Mount ~/.claude.json file explicitly in docker-manager (create if missing)
- Update entrypoint to write to /host$HOME/.claude.json in chroot mode
- File now accessible both before and after chroot transition

This fixes the ENOENT error: "no such file or directory, lstat '/home/runner/.claude.json'"

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

* fix: handle empty ~/.claude.json created by docker-manager

The entrypoint validation was failing when docker-manager created an empty {} file for mounting. Now the entrypoint handles three cases:
1. File exists with apiKeyHelper → validate it matches
2. File exists without apiKeyHelper → write it (empty {} from mount)
3. File doesn't exist → create it with apiKeyHelper

This fixes the validation error where entrypoint rejected the empty file instead of populating it with the apiKeyHelper configuration.

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

* fix: use 0o666 mode for ~/.claude.json to fix permissions (#852)

* Initial plan

* fix: use 0o666 mode for ~/.claude.json to fix permissions

Container root writes to this file, changing ownership. Using 0o666 ensures host user can still read it after modifications.

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

---------

Co-authored-by: anthropic-code-agent[bot] <242468646+Claude@users.noreply.github.com>
Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

* fix: use 666 permissions for ~/.claude.json to allow host user writes

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

* fix(ci): add api-proxy logs to smoke-claude artifacts

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

* feat: add ANTHROPIC_AUTH_TOKEN placeholder to agent environment

This commit adds ANTHROPIC_AUTH_TOKEN with a placeholder value to the agent environment when api-proxy is enabled with Anthropic credentials. This ensures Claude Code CLI compatibility while maintaining credential isolation (real auth happens via ANTHROPIC_BASE_URL).

Changes:
- src/docker-manager.ts: Set ANTHROPIC_AUTH_TOKEN to placeholder when anthropicApiKey is configured with api-proxy
- src/docker-manager.test.ts: Updated all Anthropic test cases to verify ANTHROPIC_AUTH_TOKEN is set to placeholder value
- containers/agent/api-proxy-health-check.sh: Added validation that ANTHROPIC_AUTH_TOKEN (if present) is the placeholder value, not a real token
- tests/integration/api-proxy.test.ts: Added integration test to verify ANTHROPIC_AUTH_TOKEN is set to placeholder in agent container

Security model:
- Real ANTHROPIC_API_KEY stays in api-proxy container only
- Agent gets placeholder ANTHROPIC_AUTH_TOKEN for CLI compatibility
- Agent gets ANTHROPIC_BASE_URL pointing to api-proxy (http://172.30.0.30:10001)
- Real authentication happens when api-proxy injects the real key
- Health checks verify no real credentials leak to agent environment

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

* fix: restore CODEX_API_KEY env var passing for codex agent compatibility

This commit restores the behavior where CODEX_API_KEY is passed directly to the agent environment, even when api-proxy is enabled. This ensures Codex agent compatibility while keeping Claude Code authentication unchanged (using api-proxy pattern with placeholder tokens).

Changes:
- src/docker-manager.ts: Remove CODEX_API_KEY from EXCLUDED_ENV_VARS when api-proxy is enabled (line 334 removed)
- src/docker-manager.ts: Remove api-proxy check for CODEX_API_KEY, allowing it to be passed unconditionally (line 423)
- src/docker-manager.test.ts: Update test to expect CODEX_API_KEY to be present in agent environment when api-proxy is enabled

Authentication model:
- Claude/Anthropic: Uses api-proxy pattern (no real keys in agent, placeholder ANTHROPIC_AUTH_TOKEN, BASE_URL pointing to api-proxy)
- Codex: Uses direct credential passing (CODEX_API_KEY in agent env)
- OpenAI: Uses api-proxy pattern (excluded from agent when enabled)

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

* fix: disable CODEX_API_KEY health check for direct credential passing

The health check was failing because CODEX_API_KEY is now intentionally passed to the agent environment for Codex compatibility. This commit comments out the CODEX_API_KEY validation in the health check while keeping the OPENAI_API_KEY and OPENAI_KEY checks intact.

Changes:
- containers/agent/api-proxy-health-check.sh: Remove CODEX_API_KEY from the credential isolation check (line 68)
- containers/agent/api-proxy-health-check.sh: Comment out CODEX_API_KEY error message (line 72)
- Added explanatory comments noting that CODEX_API_KEY is intentionally passed through for Codex agent compatibility

Health check now validates:
- ✓ ANTHROPIC_API_KEY still excluded (Claude Code uses api-proxy)
- ✓ OPENAI_API_KEY still excluded (uses api-proxy when enabled)
- ✓ CODEX_API_KEY validation disabled (Codex uses direct credentials)

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

* fix: disable OPENAI_BASE_URL for codex agent (temporary)

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

* docs: update OPENAI_BASE_URL to include /v1 path suffix

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

---------

Co-authored-by: anthropic-code-agent[bot] <242468646+Claude@users.noreply.github.com>
Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
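The commit "fix: handle empty ~/.claude.json created by docker-manager" above lists three cases the entrypoint handles. The real logic lives in entrypoint.sh (shell) and is not reproduced on this page, so the following is only a TypeScript rendering of that decision table. The function name ensureApiKeyHelper, the Outcome type, and the temp-directory setup are hypothetical, and throwing on a mismatch is an assumption about what validation failure means.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

type Outcome = "validated" | "populated" | "created";

// Hypothetical rendering of the three cases from the commit message.
function ensureApiKeyHelper(configPath: string, helperPath: string): Outcome {
  if (!fs.existsSync(configPath)) {
    // Case 3: file doesn't exist, so create it with apiKeyHelper.
    fs.writeFileSync(configPath, JSON.stringify({ apiKeyHelper: helperPath }));
    return "created";
  }
  const raw = fs.readFileSync(configPath, "utf8").trim();
  const config: Record<string, unknown> = raw === "" ? {} : JSON.parse(raw);
  if (config["apiKeyHelper"] === undefined) {
    // Case 2: file exists without apiKeyHelper (the empty {} created for mounting).
    config["apiKeyHelper"] = helperPath;
    fs.writeFileSync(configPath, JSON.stringify(config));
    return "populated";
  }
  // Case 1: file exists with apiKeyHelper, so check that it matches.
  if (config["apiKeyHelper"] !== helperPath) {
    throw new Error(`apiKeyHelper mismatch: ${String(config["apiKeyHelper"])}`);
  }
  return "validated";
}

// Walk through all three cases against a temp file.
const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), "api-key-helper-"));
const cfgPath = path.join(tmpDir, ".claude.json");
const helper = "/usr/local/bin/get-claude-key.sh";
const first = ensureApiKeyHelper(cfgPath, helper); // file missing
fs.writeFileSync(cfgPath, "{}"); // simulate the empty file from docker-manager
const second = ensureApiKeyHelper(cfgPath, helper); // file present, no helper
const third = ensureApiKeyHelper(cfgPath, helper); // file present, helper set
console.log(first, second, third);
```

Case 2 merges into the parsed object rather than overwriting the file, so any other keys already present would be kept; the commit message does not say which of the two the shell script does.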
The GitHub Actions workflow was failing with EACCES: permission denied errors when trying to read ~/.claude.json after the agent container completed execution.
Root Cause
The file permission flow was incorrect:
- docker-manager.ts creates ~/.claude.json on the host with mode 0o600 (owner read/write only)
Changes
- 0o600 to 0o666 in src/docker-manager.ts:565
This allows both container root (for entrypoint writes) and host user (for workflow debugging) to access the file after container modifications.
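To make the difference between the modes concrete, the sketch below decodes the read bits of a POSIX mode. It assumes, as the description states, that the file ends up owned by root, in which case the host user is checked against the "other" bits (or the group bits if it shares the file's group). The function name readableBy is hypothetical and appears nowhere in the PR.

```typescript
// Decode which permission classes may read a file with the given mode bits.
function readableBy(mode: number): { owner: boolean; group: boolean; other: boolean } {
  return {
    owner: (mode & 0o400) !== 0,
    group: (mode & 0o040) !== 0,
    other: (mode & 0o004) !== 0,
  };
}

// Compare the modes discussed in this PR and in the review comment.
for (const mode of [0o600, 0o644, 0o666]) {
  console.log(mode.toString(8), readableBy(mode));
}
```

With 0o600 only the owner can read, which matches the EACCES symptom if the owner is root; both 0o644 and 0o666 grant read access to everyone, and 0o666 additionally grants write access.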