ASI-06: Sanitize repo-memory and cache-memory content before prompt injection #28777

Closed
Copilot wants to merge 4 commits into main from copilot/asi-06-sanitize-repo-and-cache-memory

Contributor

Copilot AI commented Apr 27, 2026

Persisted memory content (repo-memory and cache-memory) was injected into the agent prompt without any content scanning, enabling Memory & Context Poisoning attacks (OWASP Agentic Top 10 ASI-06): a compromised prior run could write prompt injection payloads into memory, overriding agent instructions on the next run.

Changes

Runtime content scanner (actions/setup/sh/sanitize_memory.sh)

New script that scans text files in a memory directory for 20+ known injection patterns — system prompt overrides (ignore previous instructions, you are now), role injection markers ([INST], [SYS], <|system|>), jailbreak phrases, and credential exfiltration directives. Suspicious files are moved to a quarantine directory (preserving relative path structure) and emit ::warning:: annotations. Binary files and .git/ are skipped.

Cache-memory integration (setup_cache_memory_git.sh)

Added step 4 to the existing pre-agent sanitization sequence (after symlink deletion → execute-bit strip → extension filtering) that calls sanitize_memory.sh against the restored cache directory.

Repo-memory integration (repo_memory.go)

generateRepoMemorySteps() emits an explicit named workflow step (Scan repo-memory for prompt injection) as the single scanning point for each cloned memory. The scan no longer runs redundantly inside clone_repo_memory_branch.sh.

Prompt boundary markers

All four memory prompt templates now carry sanitized="true" on their XML boundary elements, signalling to the model that the enclosed content has been scanned:

<repo-memory sanitized="true">   <!-- was: <repo-memory> -->
<cache-memory sanitized="true">  <!-- was: <cache-memory> -->

Go module (pkg/workflow/memory_sanitizer.go)

Provides generateRepoMemorySanitizationStep() — the single source of truth for emitting the scan step into compiled lock files. The emitted run command references the sanitizeMemoryScriptName constant to prevent drift if the script is renamed.

Tests

  • pkg/workflow/memory_sanitizer_test.go: Covers step generation for default/wiki/named memories and reads the actual runtime prompt files from disk to assert the sanitized="true" attribute is present.
  • pkg/workflow/sanitize_memory_script_test.go: Executes sanitize_memory.sh as a subprocess (following the pattern in git_patch_head_test.go) with 8 test functions covering: clean files, 9 injection pattern categories, relative-path preservation in quarantine, .git/ directory exclusion, empty directory, non-existent directory, and mixed safe/suspicious content.

Copilot AI and others added 2 commits April 27, 2026 18:17
Copilot AI changed the title from "[WIP] Fix memory sanitization for repo and cache content" to "ASI-06: Sanitize repo-memory and cache-memory content before prompt injection" Apr 27, 2026
Copilot AI requested a review from lpcox April 27, 2026 18:23
@lpcox lpcox marked this pull request as ready for review April 27, 2026 18:41
Copilot AI review requested due to automatic review settings April 27, 2026 18:41
@github-actions github-actions Bot mentioned this pull request Apr 27, 2026
Contributor

🧪 Test Quality Sentinel Report

Test Quality Score: 63/100

⚠️ Acceptable, with suggestions

| Metric | Value |
|---|---|
| New/modified tests analyzed | 6 |
| ✅ Design tests (behavioral contracts) | 5 (83%) |
| ⚠️ Implementation tests (low value) | 1 (17%) |
| Tests with error/edge cases | 2 (33%) |
| Duplicate test clusters | 0 |
| Test inflation detected | ✅ Yes (181 test lines / 31 production lines ≈ 5.8:1) |
| 🚨 Coding-guideline violations | None |

Test Classification Details

| Test | File | Classification | Issues Detected |
|---|---|---|---|
| TestGenerateRepoMemorySanitizationStep_DefaultMemory | pkg/workflow/memory_sanitizer_test.go:31 | ✅ Design | None; verifies observable YAML step output |
| TestGenerateRepoMemorySanitizationStep_WikiMemory | pkg/workflow/memory_sanitizer_test.go:50 | ✅ Design | No error path; happy-path variant only |
| TestGenerateRepoMemorySanitizationStep_NamedMemory | pkg/workflow/memory_sanitizer_test.go:68 | ✅ Design | No error path; happy-path variant only |
| TestSanitizeMemoryScriptNameConstant | pkg/workflow/memory_sanitizer_test.go:84 | ⚠️ Implementation | Asserts a constant string value; low behavioral value |
| TestRepoMemoryPromptHasSanitizedAttribute | pkg/workflow/memory_sanitizer_test.go:90 | ✅ Design | Good: covers file content contract + 2 config variants via t.Run; require.NoError covers I/O error path |
| TestCacheMemoryPromptHasSanitizedAttribute | pkg/workflow/memory_sanitizer_test.go:135 | ✅ Design | Good: covers file content contract + 2 config variants via t.Run; require.NoError covers I/O error path |

Flagged Tests — Requires Review

⚠️ TestSanitizeMemoryScriptNameConstant (pkg/workflow/memory_sanitizer_test.go:84)

Classification: Implementation test
Issue: This test only asserts that the sanitizeMemoryScriptName constant equals "sanitize_memory.sh". Asserting a constant value provides no behavioral guarantee — if the constant were renamed or mistyped, the test would still pass as long as the string literal matched.
What design invariant does this test enforce? None observable — it verifies an internal string constant, not that the script is actually invoked or that the step output references the correct filename.
What would break if deleted? Very little. The test only detects a change to the constant's value, and the three TestGenerateRepoMemorySanitizationStep_* tests already assert assert.Contains(t, output, "sanitize_memory.sh", ...) on the generated step output, providing stronger coverage of the same invariant.
Suggested improvement: Remove this test — it is fully redundant with the step-generation tests above, which assert that the output contains "sanitize_memory.sh". Alternatively, promote it to an integration-style test that verifies the script file exists on disk at the expected path (catching a deployment mismatch rather than a constant typo).


Test Inflation Note

The test file (181 lines added) is ~5.8× larger than the production file it covers (31 lines added). This is worth noting, though much of the test file exercises adjacent functions (buildRepoMemoryPromptSection, buildCacheMemoryPromptSection) and reads actual prompt files on disk — expanding the behavioral surface beyond the single production file. The inflation is not a false-comfort issue here, but consider whether the additional coverage of buildRepoMemoryPromptSection / buildCacheMemoryPromptSection should be credited against their respective production files (repo_memory.go).


Strengths

  • ✅ All assertions include descriptive messages — guideline compliant
  • ✅ Build tag //go:build !integration present on line 1
  • ✅ No mock libraries used — guideline compliant
  • TestRepoMemoryPromptHasSanitizedAttribute and TestCacheMemoryPromptHasSanitizedAttribute enforce the ASI-06 contract directly against deployed prompt files, which is high-value end-to-end coverage
  • require used correctly for setup assertions in helper (readPromptFile)

Language Support

Tests analyzed:

  • 🐹 Go (*_test.go): 6 tests — unit (//go:build !integration)
  • 🟨 JavaScript (*.test.cjs, *.test.js): 0 tests

Verdict

Check passed. 17% of new tests are implementation tests (threshold: 30%). No coding-guideline violations detected.


📖 Understanding Test Classifications

Design Tests (High Value) verify what the system does:

  • Assert on observable outputs, return values, or state changes
  • Cover error paths and boundary conditions
  • Would catch a behavioral regression if deleted
  • Remain valid even after internal refactoring

Implementation Tests (Low Value) verify how the system does it:

  • Assert on internal function calls or constant values
  • Only test the happy path with typical inputs
  • Break during legitimate refactoring even when behavior is correct
  • Give false assurance: they pass even when the system is wrong

Goal: Shift toward tests that describe the system's behavioral contract — the promises it makes to its users and collaborators.

🧪 Test quality analysis by Test Quality Sentinel · ● 405.7K ·

Contributor

@github-actions github-actions Bot left a comment

✅ Test Quality Sentinel: 63/100. Test quality is acceptable — 17% of new tests are implementation tests (threshold: 30%). No coding-guideline violations detected. One low-value test (TestSanitizeMemoryScriptNameConstant) flagged for removal; see comment for details.

Contributor

Copilot AI left a comment

Pull request overview

Adds a pre-agent sanitization layer for persisted memory sources (repo-memory + cache-memory) to mitigate prompt injection / memory poisoning (OWASP Agentic Top 10 ASI-06) by scanning and quarantining suspicious memory files before they’re injected into prompts.

Changes:

  • Introduces a new sanitize_memory.sh scanner that quarantines files matching known injection/exfiltration patterns.
  • Integrates sanitization into cache-memory setup and repo-memory cloning, and emits an explicit, auditable workflow step for repo-memory scanning.
  • Updates memory prompt boundary templates to include sanitized="true" and adds Go tests for step generation + prompt template markers.
| File | Description |
|---|---|
| actions/setup/sh/sanitize_memory.sh | New runtime scanner that scans memory directories and quarantines suspicious files with workflow warnings. |
| actions/setup/sh/setup_cache_memory_git.sh | Runs the new scanner after existing pre-agent cache working-tree sanitization steps. |
| actions/setup/sh/clone_repo_memory_branch.sh | Runs the new scanner after cloning/creating repo-memory working directory. |
| pkg/workflow/memory_sanitizer.go | Adds a workflow-step generator for repo-memory sanitization. |
| pkg/workflow/repo_memory.go | Emits an explicit “Scan repo-memory…” step after clone for auditability. |
| pkg/workflow/memory_sanitizer_test.go | Adds tests for step generation and asserts sanitized="true" exists in prompt templates. |
| actions/setup/md/repo_memory_prompt*.md | Adds sanitized="true" attribute to repo-memory prompt boundary tags. |
| actions/setup/md/cache_memory_prompt*.md | Adds sanitized="true" attribute to cache-memory prompt boundary tags. |

Copilot's findings

  • Files reviewed: 10/10 changed files
  • Comments generated: 3

Comment on lines +1 to +8
#!/usr/bin/env bash

# sanitize_memory.sh
# Pre-agent content scanning for prompt injection in memory files.
#
# This script scans text files in a memory directory for known prompt injection
# patterns (system prompt overrides, role-play injections, instruction-ignoring
# directives) per OWASP Agentic Top 10 — ASI-06 (Memory & Context Poisoning).
Comment thread pkg/workflow/memory_sanitizer.go Outdated
}
builder.WriteString(" env:\n")
fmt.Fprintf(builder, " GH_AW_SCAN_DIR: %s\n", memoryDir)
builder.WriteString(" run: bash \"${RUNNER_TEMP}/gh-aw/actions/sanitize_memory.sh\"\n")
Comment on lines +548 to +553

// Step 2: Scan the cloned memory for prompt injection (ASI-06).
// The sanitize_memory.sh script is also invoked directly by the clone script,
// but we emit an explicit step here so that the scan appears in the workflow
// summary and its output is auditable independently of the clone step.
generateRepoMemorySanitizationStep(builder, memory, memoryDir)
Collaborator

lpcox commented Apr 27, 2026

@copilot address this review feedback #28777 (review)

Contributor Author

Copilot AI commented Apr 27, 2026

@copilot address this review feedback #28777 (review)

Addressed all three review points in commit 0bcbe11:

  1. Double-scanning: Removed the sanitize_memory.sh call from clone_repo_memory_branch.sh. The explicit compiled workflow step is now the single scanning point, keeping behaviour in one place.

  2. Use constant: The run command in generateRepoMemorySanitizationStep now uses sanitizeMemoryScriptName via fmt.Sprintf instead of a hard-coded string.

  3. Shell script tests: Added sanitize_memory_script_test.go with 8 test functions covering: clean files (no quarantine), 9 injection pattern categories, relative-path preservation in quarantine, .git/ directory exclusion, empty directory, non-existent directory, and mixed safe/suspicious content.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • https://api.github.com/graphql
    • Triggering command: /usr/bin/gh /usr/bin/gh api graphql -f query=query($owner: String!, $name: String!) { repository(owner: $owner, name: $name) { hasDiscussionsEnabled } } -f owner=github -f name=gh-aw GO111MODULE 64/bin/go git rev-… --show-toplevel go /usr/bin/git -json GO111MODULE 64/bin/go git (http block)
  • https://api.github.com/repos/actions/ai-inference/git/ref/tags/v1
    • Triggering command: /usr/bin/gh gh api /repos/actions/ai-inference/git/ref/tags/v1 --jq [.object.sha, .object.type] | @tsv -json GO111MODULE /usr/bin/git GOINSECURE GOMOD GOMODCACHE git rev-… --show-toplevel GOPROXY /usr/bin/git GOSUMDB GOWORK 64/bin/go git (http block)
  • https://api.github.com/repos/actions/checkout/git/ref/tags/v3
    • Triggering command: /usr/bin/gh gh api /repos/actions/checkout/git/ref/tags/v3 --jq [.object.sha, .object.type] | @tsv -json GO111MODULE ache/go/1.25.8/x64/bin/go GOINSECURE GOMOD GOMODCACHE go env -json GO111MODULE ache/go/1.25.8/x64/bin/go GOINSECURE GOMOD GOMODCACHE go (http block)
  • https://api.github.com/repos/actions/checkout/git/ref/tags/v5
    • Triggering command: /usr/bin/gh gh api /repos/actions/checkout/git/ref/tags/v5 --jq [.object.sha, .object.type] | @tsv "prettier" --cheGOINSECURE GOPROXY 64/bin/go GOSUMDB GOWORK 64/bin/go git cat-… --end-of-optionsGOINSECURE blob 64/bin/go (http block)
    • Triggering command: /usr/bin/gh gh api /repos/actions/checkout/git/ref/tags/v5 --jq [.object.sha, .object.type] | @tsv --show-toplevel sh /usr/bin/git npx prettier --ctr GOPROXY 64/bin/go git rev-… --show-toplevel node /usr/bin/git --check scripts/**/*.js 64/bin/go git (http block)
    • Triggering command: /usr/bin/gh gh api /repos/actions/checkout/git/ref/tags/v5 --jq [.object.sha, .object.type] | @tsv --show-toplevel go /usr/bin/git -json GO111MODULE ache/go/1.25.8/x--show-toplevel git rev-… --show-toplevel fe.txt /usr/bin/git -json GO111MODULE ache/go/1.25.8/x--show-toplevel git (http block)
  • https://api.github.com/repos/actions/checkout/git/ref/tags/v6
    • Triggering command: /usr/bin/gh gh api /repos/actions/checkout/git/ref/tags/v6 --jq [.object.sha, .object.type] | @tsv -json GO111MODULE 64/bin/go GOINSECURE GOMOD GOMODCACHE go env -json GO111MODULE 64/bin/go GOINSECURE GOMOD GOMODCACHE go (http block)
    • Triggering command: /usr/bin/gh gh api /repos/actions/checkout/git/ref/tags/v6 --jq [.object.sha, .object.type] | @tsv -json GO111MODULE x_amd64/cgo GOINSECURE GOMOD GOMODCACHE x_amd64/cgo env -json GO111MODULE 64/bin/go GOINSECURE GOMOD GOMODCACHE go (http block)
    • Triggering command: /usr/bin/gh gh api /repos/actions/checkout/git/ref/tags/v6 --jq [.object.sha, .object.type] | @tsv -json GO111MODULE 64/bin/go GOINSECURE GOMOD GOMODCACHE go env -json GO111MODULE ache/go/1.25.8/x64/bin/go GOINSECURE GOMOD GOMODCACHE go (http block)
  • https://api.github.com/repos/actions/github-script/git/ref/tags/v8
    • Triggering command: /usr/bin/gh gh api /repos/actions/github-script/git/ref/tags/v8 --jq [.object.sha, .object.type] | @tsv --show-toplevel /opt/hostedtoolcGOPROXY /usr/bin/git /tmp/go-build127bash -trimpath 64/bin/go git rev-… --show-toplevel go /usr/bin/git -json GO111MODULE 64/bin/go git (http block)
  • https://api.github.com/repos/actions/github-script/git/ref/tags/v9
    • Triggering command: /usr/bin/gh gh api /repos/actions/github-script/git/ref/tags/v9 --jq [.object.sha, .object.type] | @tsv -json GO111MODULE /opt/hostedtoolcache/go/1.25.8/x64/bin/go GOINSECURE GOMOD GOMODCACHE go env -json GO111MODULE /usr/local/sbin/bash GOINSECURE GOMOD GOMODCACHE bash (http block)
    • Triggering command: /usr/bin/gh gh api /repos/actions/github-script/git/ref/tags/v9 --jq [.object.sha, .object.type] | @tsv -json GO111MODULE /opt/hostedtoolcache/go/1.25.8/x64/bin/go GOINSECURE GOMOD GOMODCACHE go env -json GO111MODULE /usr/bin/git GOINSECURE GOMOD GOMODCACHE git (http block)
    • Triggering command: /usr/bin/gh gh api /repos/actions/github-script/git/ref/tags/v9 --jq [.object.sha, .object.type] | @tsv --show-toplevel npx /usr/bin/git --check scripts/**/*.js 64/bin/go git rev-… --show-toplevel go /usr/bin/git -json GO111MODULE 64/bin/go git (http block)
  • https://api.github.com/repos/actions/setup-go/git/ref/tags/v4
    • Triggering command: /usr/bin/gh gh api /repos/actions/setup-go/git/ref/tags/v4 --jq [.object.sha, .object.type] | @tsv -json GO111MODULE /opt/hostedtoolcache/go/1.25.8/x64/bin/go GOINSECURE GOMOD GOMODCACHE go env -json GO111MODULE /usr/bin/git GOINSECURE GOMOD GOMODCACHE git (http block)
  • https://api.github.com/repos/actions/setup-node/git/ref/tags/v4
    • Triggering command: /usr/bin/gh gh api /repos/actions/setup-node/git/ref/tags/v4 --jq [.object.sha, .object.type] | @tsv -json GO111MODULE 64/bin/go GOINSECURE GOMOD GOMODCACHE go env -json GO111MODULE ache/go/1.25.8/x64/bin/go GOINSECURE GOMOD GOMODCACHE go (http block)
    • Triggering command: /usr/bin/gh gh api /repos/actions/setup-node/git/ref/tags/v4 --jq [.object.sha, .object.type] | @tsv f/tags/v6 GO111MODULE sv GOINSECURE GOMOD GOMODCACHE go env -json GO111MODULE /opt/hostedtoolcache/go/1.25.8/x64/pkg/tool/linux_amd64/vet GOINSECURE GOMOD GOMODCACHE /opt/hostedtoolcache/go/1.25.8/x64/pkg/tool/linux_amd64/vet (http block)
  • https://api.github.com/repos/actions/setup-node/git/ref/tags/v6
    • Triggering command: /usr/bin/gh gh api /repos/actions/setup-node/git/ref/tags/v6 --jq [.object.sha, .object.type] | @tsv -json GO111MODULE 64/bin/go GOINSECURE GOMOD GOMODCACHE go env -json GO111MODULE 64/bin/go GOINSECURE GOMOD GOMODCACHE go (http block)
    • Triggering command: /usr/bin/gh gh api /repos/actions/setup-node/git/ref/tags/v6 --jq [.object.sha, .object.type] | @tsv -json GO111MODULE x_amd64/vet GOINSECURE GOMOD GOMODCACHE x_amd64/vet env -json GO111MODULE 64/bin/go GOINSECURE GOMOD GOMODCACHE go (http block)
  • https://api.github.com/repos/actions/upload-artifact/git/ref/tags/v4
    • Triggering command: /usr/bin/gh gh api /repos/actions/upload-artifact/git/ref/tags/v4 --jq [.object.sha, .object.type] | @tsv -json GO111MODULE 64/bin/go GOINSECURE GOMOD GOMODCACHE go env -json GO111MODULE ache/go/1.25.8/x64/bin/go GOINSECURE GOMOD GOMODCACHE go (http block)
  • https://api.github.com/repos/github/gh-aw-actions/git/ref/tags/v0.1.2
    • Triggering command: /usr/bin/gh gh api /repos/github/gh-aw-actions/git/ref/tags/v0.1.2 --jq [.object.sha, .object.type] | @tsv -json GO111MODULE /opt/hostedtoolcache/go/1.25.8/x64/bin/go GOINSECURE GOMOD GOMODCACHE go env -json GO111MODULE /usr/bin/tail GOINSECURE GOMOD GOMODCACHE tail (http block)
  • https://api.github.com/repos/github/gh-aw-actions/git/ref/tags/v1.0.0
    • Triggering command: /usr/bin/gh gh api /repos/github/gh-aw-actions/git/ref/tags/v1.0.0 --jq [.object.sha, .object.type] | @tsv -json GO111MODULE 64/bin/go GOINSECURE GOMOD GOMODCACHE go env -json GO111MODULE ache/go/1.25.8/x64/bin/go GOINSECURE GOMOD GOMODCACHE go (http block)
  • https://api.github.com/repos/github/gh-aw-actions/git/ref/tags/v1.2.3
    • Triggering command: /usr/bin/gh gh api /repos/github/gh-aw-actions/git/ref/tags/v1.2.3 --jq [.object.sha, .object.type] | @tsv -json GO111MODULE 64/bin/go GOINSECURE GOMOD GOMODCACHE go env -json GO111MODULE ache/go/1.25.8/x64/bin/go GOINSECURE GOMOD GOMODCACHE go (http block)
  • https://api.github.com/repos/github/gh-aw/git/ref/tags/v0.47.4
    • Triggering command: /usr/bin/gh gh api /repos/github/gh-aw/git/ref/tags/v0.47.4 --jq [.object.sha, .object.type] | @tsv --show-toplevel node /usr/bin/git prettier --check 64/bin/go git rev-… --show-toplevel go /usr/bin/git -json GO111MODULE 64/bin/go git (http block)
  • https://api.github.com/repos/github/gh-aw/git/ref/tags/v1.0.0
    • Triggering command: /usr/bin/gh gh api /repos/github/gh-aw/git/ref/tags/v1.0.0 --jq [.object.sha, .object.type] | @tsv GOSUMDB GOWORK 64/bin/go GOINSECURE GOMOD GOMODCACHE node /opt… run lint:cjs 64/bin/go GOSUMDB GOWORK 64/bin/go sh (http block)
  • https://api.github.com/repos/nonexistent/action/git/ref/tags/v999.999.999
    • Triggering command: /usr/bin/gh gh api /repos/nonexistent/action/git/ref/tags/v999.999.999 --jq [.object.sha, .object.type] | @tsv GOSUMDB GOWORK 64/bin/go GOINSECURE GOMOD GOMODCACHE node /opt… run lint:cjs 64/bin/go GOSUMDB GOWORK 64/bin/go sh (http block)

If you need me to access, download, or install something from one of these locations, you can either:
