Skip to content

cache-memory: add pre-agent working-tree sanitization to neutralize planted executables and disallowed files#26587

Merged
pelikhan merged 8 commits intomainfrom
copilot/fix-cache-memory-issue
Apr 16, 2026
Merged

cache-memory: add pre-agent working-tree sanitization to neutralize planted executables and disallowed files#26587
pelikhan merged 8 commits intomainfrom
copilot/fix-cache-memory-issue

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Apr 16, 2026

With integrity: none (the default), cache-memory restores prior-run files into /tmp/gh-aw/cache-memory/ before the agent runs and before threat detection fires. A compromised run can plant executable scripts (e.g. helper.sh), symlinks to sensitive files, or unexpected file types that the agent encounters without any prior validation gate.

Changes

  • setup_cache_memory_git.sh — added a pre-agent sanitization block that runs after git checkout/merge-down, immediately before the agent can access the working tree:

    • Unconditional: deletes all working-tree symlinks (preventing planted links to files outside the cache), then strips execute bits from all non-git working-tree files
    • Conditional on GH_AW_ALLOWED_EXTENSIONS: removes files whose extension isn't in the allowed list using NUL-delimited find -print0 for safe handling of filenames with newlines; extension matching is case-insensitive and whitespace-tolerant (both the allowed list entries and extracted extensions are lowercased and whitespace-stripped before comparison)
  • pkg/workflow/cache.go (generateCacheMemoryGitSetupStep) — passes GH_AW_ALLOWED_EXTENSIONS as a colon-separated env var to the git setup step when AllowedExtensions is configured. Extensions are validated at parse time against ^\.[A-Za-z0-9]+$ (via isValidFileExtension) to prevent YAML injection, and single quotes are escaped (''') at emit time as defense-in-depth:

    if len(cache.AllowedExtensions) > 0 {
        escaped := strings.ReplaceAll(strings.Join(cache.AllowedExtensions, ":"), "'", "''")
        fmt.Fprintf(builder, "          GH_AW_ALLOWED_EXTENSIONS: '%s'\n", escaped)
    }
  • pkg/workflow/cache_integrity_test.go — unit tests verifying the env var is emitted when extensions are configured, omitted when the list is empty, and that isValidFileExtension correctly accepts/rejects values and that single-quote escaping is applied in generated YAML.

  • actions/setup/sh/setup_cache_memory_git_test.sh — shell tests for the sanitization block covering 11 scenarios (32 assertions):

    • Execute bits are stripped unconditionally from all working-tree files
    • .git/ directory is excluded from sanitization
    • All files are preserved when GH_AW_ALLOWED_EXTENSIONS is unset
    • Files with disallowed extensions are removed when the allowlist is active
    • Files with no extension are removed when an allowlist is active
    • Single-extension allowlists work correctly
    • Execute-bit stripping and extension filtering both apply in the same run
    • Extension matching is case-insensitive (.JSON file matched by .json allowlist entry)
    • Whitespace in allowed extensions list is trimmed (.json matches .json files)
    • Symlinks in the working tree are deleted unconditionally
    • Files with spaces in their names are handled correctly

The execute-bit stripping and symlink deletion are always active for all cache-memory workflows. Extension-based removal only activates for workflows that explicitly set allowed-extensions.

Copilot AI and others added 2 commits April 16, 2026 09:06
…d executables and disallowed files

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/d46a2d61-1cc8-404f-a706-606b9f7e42c9

Co-authored-by: szabta89 <1330202+szabta89@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix cache memory integrity issue with prior unvalidated content cache-memory: add pre-agent working-tree sanitization to neutralize planted executables and disallowed files Apr 16, 2026
Copilot AI requested a review from szabta89 April 16, 2026 09:21
@github-actions
Copy link
Copy Markdown
Contributor

Hey @Copilot 👋 — great work on this cache memory integrity hardening! Stripping execute bits and filtering disallowed extensions before the agent reads the working tree is exactly the kind of defence-in-depth the project needs.

The PR looks well-structured: the shell script hardening, the Go env-var wiring in cache.go, and the unit tests in cache_integrity_test.go all tell a coherent story. Build validation (make fmt, make build, make recompile) is confirmed passing too.

This looks ready for maintainer review once the draft status is lifted. Nice one! 🎉

Generated by Contribution Check · ● 2.4M ·

@pelikhan
Copy link
Copy Markdown
Collaborator

@copilot merge main and recompile

Copilot AI and others added 2 commits April 16, 2026 13:51
…ory-issue

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/6cdf49f7-b40e-4d7a-876c-231165f06b79

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copy link
Copy Markdown
Contributor Author

Copilot AI commented Apr 16, 2026

@copilot merge main and recompile

Done in 64f9d20 — merged main and recompiled all 191 workflow lock files successfully.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • https://api.github.com/graphql
    • Triggering command: /usr/bin/gh /usr/bin/gh api graphql -f query=query($owner: String!, $name: String!) { repository(owner: $owner, name: $name) { hasDiscussionsEnabled } } -f owner=github -f name=gh-aw (http block)
    • Triggering command: /usr/bin/gh /usr/bin/gh api graphql -f query=query($owner: String!, $name: String!) { repository(owner: $owner, name: $name) { hasDiscussionsEnabled } } -f owner=github -f name=gh-aw -embedcfg /tmp/go-build2414702422/b345/embedcfg -pack (http block)
  • https://api.github.com/repos/astral-sh/setup-uv/git/ref/tags/eac588ad8def6316056a12d4907a9d4d84ff7a3b
    • Triggering command: /usr/bin/gh gh api /repos/astral-sh/setup-uv/git/ref/tags/eac588ad8def6316056a12d4907a9d4d84ff7a3b --jq [.object.sha, .object.type] | @tsv (http block)
  • https://api.github.com/repos/githubnext/agentics/git/ref/tags/-
    • Triggering command: /usr/bin/gh gh api /repos/githubnext/agentics/git/ref/tags/- --jq [.object.sha, .object.type] | @tsv (http block)

If you need me to access, download, or install something from one of these locations, you can either:

Copilot AI requested a review from pelikhan April 16, 2026 13:55
if [ -n "${GH_AW_ALLOWED_EXTENSIONS:-}" ]; then
echo "Pre-agent sanitization: enforcing allowed extensions: ${GH_AW_ALLOWED_EXTENSIONS}"
removed=0
while IFS= read -r file; do
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot add tests

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added setup_cache_memory_git_test.sh in fa194e9. It covers 8 test scenarios (23 assertions): execute-bit stripping, .git/ dir exclusion, no filter when GH_AW_ALLOWED_EXTENSIONS is unset, extension-based removal, files without extensions, single-extension allowlists, combined stripping+filtering, and case-sensitive extension comparison. All pass.

@pelikhan pelikhan marked this pull request as ready for review April 16, 2026 14:02
Copilot AI review requested due to automatic review settings April 16, 2026 14:02
@github-actions
Copy link
Copy Markdown
Contributor

🧪 Test Quality Sentinel Report

Test Quality Score: 90/100

Excellent test quality

Metric Value
New/modified tests analyzed 3
✅ Design tests (behavioral contracts) 3 (100%)
⚠️ Implementation tests (low value) 0 (0%)
Tests with error/edge cases 3 (100%)
Duplicate test clusters 0
Test inflation detected ⚠️ Yes — 116 test lines / 8 production Go lines (14.5:1)
🚨 Coding-guideline violations None

i️ pkg/testutil/spec_test.go changes are pure whitespace formatting — no logic added or modified; excluded from scoring.


Test Classification Details

Test File Classification Notes
TestCacheMemoryGitSetupStep_AllowedExtensionsEnvVar pkg/workflow/cache_integrity_test.go:536 ✅ Design Table-driven (4 subtests): nil, empty, single, multi-extension — covers all boundary cases
TestCacheMemorySteps_PreAgentSanitizationEnvVar pkg/workflow/cache_integrity_test.go:598 ✅ Design End-to-end: config parse → extract → generate; asserts observable output contains correct env var
TestCacheMemorySteps_NoExtensionsNoSanitizationEnvVar pkg/workflow/cache_integrity_test.go:634 ✅ Design Negative case of above; verifies env var is absent when no restrictions configured

Test Inflation Note

⚠️ Test inflation: 14.5:1 ratio (click to expand)

cache_integrity_test.go added 116 lines while cache.go added only 8 lines. This exceeds the 2:1 threshold and costs 10 points.

However, context matters here: the bulk of the production logic lives in actions/setup/sh/setup_cache_memory_git.sh (+48 lines), which implements the actual sanitization in Bash. The Go tests are verifying the compiler correctly emits the env var that drives that script — they're the only practical way to test this integration at the Go unit-test level.

Impact on score: −10 pts (90 instead of 100). Not a guideline violation.


No Flagged Tests

All three new test functions are well-structured:

  • ✅ Build tag //go:build !integration present
  • ✅ No mock libraries (gomock, testify/mock, .EXPECT(), .On())
  • ✅ Descriptive assertion messages on every assert.* / require.* call
  • ✅ Table-driven test covers nil, empty, single-item, and multi-item edge cases
  • require.NoError used for critical setup; assert.* for validations

Language Support

Tests analyzed:

  • 🐹 Go (*_test.go): 3 tests — unit (//go:build !integration)
  • 🟨 JavaScript (*.test.cjs, *.test.js): 0 tests

Verdict

Check passed. 0% of new tests are implementation tests (threshold: 30%). All three tests verify behavioral contracts — they assert on the observable string output of the compiler, covering the positive path (env var present with correct value), negative path (env var absent), and boundary conditions (nil vs. empty slice). No coding-guideline violations detected.


📖 Understanding Test Classifications

Design Tests (High Value) verify what the system does:

  • Assert on observable outputs, return values, or state changes
  • Cover error paths and boundary conditions
  • Would catch a behavioral regression if deleted
  • Remain valid even after internal refactoring

Implementation Tests (Low Value) verify how the system does it:

  • Assert on internal function calls (mocking internals)
  • Only test the happy path with typical inputs
  • Break during legitimate refactoring even when behavior is correct
  • Give false assurance: they pass even when the system is wrong

Goal: Shift toward tests that describe the system's behavioral contract — the promises it makes to its users and collaborators.

References: §24514620892

🧪 Test quality analysis by Test Quality Sentinel · ● 567.3K ·

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Test Quality Sentinel: 90/100. Test quality is excellent — 0% of new tests are implementation tests (threshold: 30%). All three new tests verify behavioral contracts with proper edge-case coverage, assertion messages, and no mock library usage.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens cache-memory restores by adding a pre-agent sanitization gate in the git-setup script, and wiring allowed-extensions through to that setup step so unexpected file types from prior runs can be removed before the agent sees them.

Changes:

  • Add pre-agent working-tree sanitization to setup_cache_memory_git.sh (strip execute bits; optionally remove disallowed extensions).
  • Emit GH_AW_ALLOWED_EXTENSIONS from generateCacheMemoryGitSetupStep when extensions are configured.
  • Add unit tests asserting the env var is emitted/omitted as expected, and update a lock workflow snapshot accordingly.
Show a summary per file
File Description
actions/setup/sh/setup_cache_memory_git.sh Adds pre-agent sanitization (chmod a-x; optional extension-based deletion).
pkg/workflow/cache.go Passes GH_AW_ALLOWED_EXTENSIONS into the setup step when configured.
pkg/workflow/cache_integrity_test.go Adds tests for env var emission/omission in generated steps.
pkg/testutil/spec_test.go Formatting-only adjustments in a test case struct literal.
.github/workflows/ai-moderator.lock.yml Updates generated workflow snapshot to include the new env var.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

  • Files reviewed: 4/5 changed files
  • Comments generated: 4

Comment on lines +114 to +120
# 1. Strip execute bits from all working-tree files so that a prior run cannot plant
# executable scripts (e.g. helper.sh) that the agent or runner could invoke before
# any validation gate fires.
find . -not -path './.git/*' -type f -exec chmod a-x {} + 2>/dev/null || true
echo "Pre-agent sanitization: stripped execute permissions from all working-tree files"

# 2. If GH_AW_ALLOWED_EXTENSIONS is set (colon-separated, e.g. .json:.md:.txt), remove
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The execute-bit stripping uses find ... -type f, which skips symlinks. A prior run could plant a symlink (e.g., secrets.json -> /home/runner/.ssh/...) that the agent can read before threat detection, bypassing both execute-bit stripping and extension enforcement. Consider deleting symlinks in the cache working tree (e.g., find ... -type l -delete) or otherwise handling them explicitly as part of pre-agent sanitization.

Suggested change
# 1. Strip execute bits from all working-tree files so that a prior run cannot plant
# executable scripts (e.g. helper.sh) that the agent or runner could invoke before
# any validation gate fires.
find . -not -path './.git/*' -type f -exec chmod a-x {} + 2>/dev/null || true
echo "Pre-agent sanitization: stripped execute permissions from all working-tree files"
# 2. If GH_AW_ALLOWED_EXTENSIONS is set (colon-separated, e.g. .json:.md:.txt), remove
# 1. Delete all working-tree symlinks so that a prior run cannot plant links to files
# outside the cache (e.g. secrets) that would bypass the regular-file checks below.
find . -not -path './.git/*' -type l -delete 2>/dev/null || true
echo "Pre-agent sanitization: deleted all working-tree symlinks"
# 2. Strip execute bits from all working-tree files so that a prior run cannot plant
# executable scripts (e.g. helper.sh) that the agent or runner could invoke before
# any validation gate fires.
find . -not -path './.git/*' -type f -exec chmod a-x {} + 2>/dev/null || true
echo "Pre-agent sanitization: stripped execute permissions from all working-tree files"
# 3. If GH_AW_ALLOWED_EXTENSIONS is set (colon-separated, e.g. .json:.md:.txt), remove

Copilot uses AI. Check for mistakes.
Comment on lines +126 to +147
while IFS= read -r file; do
filename="$(basename "$file")"
# Extract the last dot-prefixed segment as the extension, or empty if no dot
case "$filename" in
*.*) ext=".${filename##*.}" ;;
*) ext="" ;;
esac
# Check whether this extension appears in the colon-separated allowed list
found=0
IFS=: read -ra _ALLOWED_EXTS <<< "$GH_AW_ALLOWED_EXTENSIONS"
for _a in "${_ALLOWED_EXTS[@]}"; do
if [ "$ext" = "$_a" ]; then
found=1
break
fi
done
if [ "$found" -eq 0 ]; then
echo "Removing disallowed file: $file (extension: '${ext:-none}')"
rm -f "$file"
removed=$((removed + 1))
fi
done < <(find . -not -path './.git/*' -type f)
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The allowed-extension enforcement loop relies on newline-delimited find output and read -r file, which breaks for filenames containing newlines and makes it possible for a crafted filename to evade checks/removal. Consider using NUL delimiters (find ... -print0 with while IFS= read -r -d '' file) for robust sanitization.

Copilot uses AI. Check for mistakes.
Comment on lines +128 to +141
# Extract the last dot-prefixed segment as the extension, or empty if no dot
case "$filename" in
*.*) ext=".${filename##*.}" ;;
*) ext="" ;;
esac
# Check whether this extension appears in the colon-separated allowed list
found=0
IFS=: read -ra _ALLOWED_EXTS <<< "$GH_AW_ALLOWED_EXTENSIONS"
for _a in "${_ALLOWED_EXTS[@]}"; do
if [ "$ext" = "$_a" ]; then
found=1
break
fi
done
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extension matching is currently case-sensitive and does not trim whitespace in GH_AW_ALLOWED_EXTENSIONS entries. This can cause unexpected deletions (e.g., configured ".MD" won’t match "readme.md", or ".json " won’t match ".json"). Consider normalizing both the extracted ext and the allowed list (trim + lowercase) once outside the per-file loop before comparisons.

Copilot uses AI. Check for mistakes.
Comment thread pkg/workflow/cache.go
Comment on lines +529 to +531
if len(cache.AllowedExtensions) > 0 {
fmt.Fprintf(builder, " GH_AW_ALLOWED_EXTENSIONS: '%s'\n", strings.Join(cache.AllowedExtensions, ":"))
}
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GH_AW_ALLOWED_EXTENSIONS is inserted into the generated workflow YAML as a single-quoted string, but the value is built from user-configured extensions without YAML escaping/validation. If an extension contains a single quote, newline, or other YAML-significant characters, it can break the workflow (and potentially enable YAML injection in generated steps). Consider validating allowed-extensions entries to a strict pattern (e.g., ^\.[A-Za-z0-9]+$) and/or escaping for YAML single-quoted scalars (replace ' with '') before writing.

Copilot uses AI. Check for mistakes.
Generated by Design Decision Gate workflow to document the architectural
decision to strip execute bits and filter disallowed extensions from the
cache-memory working tree before agent execution.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

Commit pushed: c575b1a

🏗️ ADR gate enforced by Design Decision Gate 🏗️

@github-actions
Copy link
Copy Markdown
Contributor

🏗️ Design Decision Gate — ADR Required

This PR makes significant changes to core business logic (>100 new lines in pkg/) but does not have a linked Architecture Decision Record (ADR).

AI has analyzed the PR diff and generated a draft ADR to help you get started:

📄 Draft ADR: docs/adr/26587-pre-agent-cache-memory-working-tree-sanitization.md

The draft captures the key design decision: extend setup_cache_memory_git.sh to unconditionally strip execute bits and conditionally filter disallowed extensions from the cache-memory working tree before agent execution, along with four genuine alternatives that were considered.

What to do next

  1. Review the draft ADR committed to your branch — it was generated from the PR diff
  2. Complete the missing sections — verify the alternatives reflect what was actually considered, add any context the AI couldn't infer, and adjust the rationale if needed
  3. Commit the finalized ADR to docs/adr/ on your branch
  4. Reference the ADR in this PR body by adding a line such as:

    ADR: ADR-26587: Pre-Agent Cache-Memory Working-Tree Sanitization

Once an ADR is linked in the PR body, this gate will re-run and verify the implementation matches the decision.

Why ADRs Matter

"AI made me procrastinate on key design decisions. Because refactoring was cheap, I could always say 'I'll deal with this later.' Deferring decisions corroded my ability to think clearly."

ADRs create a searchable, permanent record of why the codebase looks the way it does. Future contributors (and your future self) will thank you.


📋 Michael Nygard ADR Format Reference

An ADR must contain these four sections to be considered complete:

  • Context — What is the problem? What forces are at play?
  • Decision — What did you decide? Why?
  • Alternatives Considered — What else could have been done?
  • Consequences — What are the trade-offs (positive and negative)?

All ADRs are stored in docs/adr/ as Markdown files numbered by PR number (e.g., 26587-pre-agent-cache-memory-working-tree-sanitization.md for PR #26587).

🔒 This PR cannot merge until an ADR is linked in the PR body.

References: §24514620900

Note

🔒 Integrity filter blocked 1 item

The following item were blocked because they don't meet the GitHub integrity level.

To allow these resources, lower min-integrity in your GitHub frontmatter:

tools:
  github:
    min-integrity: approved  # merged | approved | unapproved | none

🏗️ ADR gate enforced by Design Decision Gate 🏗️ · ● 152.6K ·

@pelikhan
Copy link
Copy Markdown
Collaborator

@copilot review all comments

@pelikhan
Copy link
Copy Markdown
Collaborator

@copiolot review all comments

…, NUL delimiters, extension normalization, YAML injection)

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/634f95c6-4b33-413e-b69c-63111abced23

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copy link
Copy Markdown
Contributor Author

Copilot AI commented Apr 16, 2026

@copilot review all comments

All four review findings addressed in d3f511a:

  • Symlinks: added find -type l -delete before the chmod step
  • NUL delimiters: find -print0 + while IFS= read -r -d '' for newline-safe iteration
  • Extension normalization: pre-build a lowercase/whitespace-stripped allowed list; normalize the extracted extension the same way (case-insensitive, e.g. .MD matches .md)
  • YAML injection: added isValidFileExtension validator (^\.[A-Za-z0-9]+$) at parse time + single-quote escaping ('') at emit time in cache.go

Tests updated accordingly (shell tests 8–11 + new TestCacheMemoryAllowedExtensions_ValidationAndEscaping).

@pelikhan pelikhan merged commit 62a06af into main Apr 16, 2026
@pelikhan pelikhan deleted the copilot/fix-cache-memory-issue branch April 16, 2026 15:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

cache-memory: default integrity: none restores unvalidated prior-run content to agent filesystem before any detection gate

4 participants