Integrity-aware cache-memory: git-backed integrity branching with policy-scoped keys #23370

@lpcox

Integrity-Aware Cache-Memory

Problem

Cache-memory (/tmp/gh-aw/cache-memory/) is a flat filesystem with no integrity provenance. Today, a none-integrity run and a merged-integrity run write to the same cache entry. A prompt-injected agent can poison the cache, and the next run — regardless of its integrity level — blindly restores that data.

This is a Biba-style integrity violation (Biba is the integrity dual of Bell-LaPadula): low-integrity data flows up into high-integrity contexts via the cache.

Concrete attack scenario

  1. Agent runs at min-integrity: none (triggered by external issue)
  2. Prompt injection causes agent to write malicious instructions to cache-memory/plan.json
  3. Next run at min-integrity: merged restores the same cache — trusts plan.json
  4. High-integrity run follows poisoned instructions

Proposed Solution: Git-Backed Integrity Branching

Replace the flat tarball cache snapshot with a local git repository inside the cache directory. Each integrity level maps to a git branch:

merged → approved → unapproved → none

Data flows downward only (read-up semantics): lower-integrity runs see higher-integrity data via merge, but higher-integrity runs never see lower-integrity data.

How it works

Pre-agent (compiler-generated step, runs outside AWF sandbox):

  1. Restore cache via actions/cache/restore (unchanged)
  2. Detect format: if no .git/, migrate legacy tarball (see Step 6 below)
  3. Checkout the branch matching the current run's min-integrity
  4. Merge down from all higher-integrity branches (-X theirs — higher integrity wins conflicts)
  5. Mount a tmpfs over .git/ to hide git metadata from the agent (see Step 4 below)

Agent runs: Reads/writes files normally — completely unaware of git.

Post-agent (compiler-generated step, runs outside AWF sandbox):

  1. Unmount/remove the tmpfs overlay
  2. git add -A && git commit -m "run-${GITHUB_RUN_ID}" on the current integrity branch
  3. Optionally run git gc --auto to control repo size
  4. Save cache via actions/cache/save (unchanged)

Key properties

| Property | Flat tarball (today) | Git-backed (proposed) |
| --- | --- | --- |
| Integrity isolation | ❌ None | ✅ Per-branch |
| Scope isolation | ❌ None | ✅ Per-policy-hash |
| Deletion tracking | ❌ No | ✅ Native git |
| Conflict resolution | N/A | ✅ Higher integrity wins (-X theirs) |
| History / attribution | ❌ None | git log per run_id |
| Post-agent diffing | ❌ Manual snapshot | git diff HEAD~1 |
| Agent awareness | N/A | ✅ Zero: agent sees plain files |
| Migration | N/A | ✅ Automatic, backward-compatible |

Implementation

All changes are in the compiler (cache.go and related). No changes needed to actions/cache, the AWF sandbox, the agent runtime, or the MCP gateway.

Step 1: Update cache key format with integrity level and scope hash

Today's key:

memory-{workflowID}-{runID}

New key format:

memory-{integrityLevel}-{policyHash}-{workflowID}-{runID}

Policy hashing

The full allow-only policy — not just repos — defines the trust boundary. A cache built under one policy must not be restored into a run with a different policy. Changing blocked-users, trusted-users, trusted-bots, repos, or min-integrity alters what data the agent can see and who is trusted, so any change must invalidate the cache.

The compiler computes a deterministic 8-character hash of the entire canonical policy at compile time:

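// Sketch: the compiler computes this during lock file generation.
// canonicalPolicy is a hypothetical helper that renders the canonical format below.
// Imports needed: crypto/sha256, encoding/hex
canonical := canonicalPolicy(allowOnly)
sum := sha256.Sum256([]byte(canonical))
policyHash := hex.EncodeToString(sum[:])[:8] // first 8 hex characters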

Canonical policy format (sorted, normalized, deterministic):

blocked-users:{sorted,lowercase,deduped list}
min-integrity:{level}
repos:{canonical scope form}
trusted-bots:{sorted,lowercase,deduped list}
trusted-users:{sorted,lowercase,deduped list}

Canonical scope forms (for the repos component):

| repos field | Canonical form |
| --- | --- |
| "all" | all |
| "owner" + owner="github" | owner:github |
| ["github/gh-aw"] | github/gh-aw |
| ["github/gh-aw-mcpg", "github/gh-aw"] | github/gh-aw\ngithub/gh-aw-mcpg |
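The scope forms in the table above could be produced along these lines. This is only a sketch: the function name canonical_repos and its mode argument are hypothetical, it assumes list entries are joined with the two-character sequence \n exactly as displayed in the table, and it applies only sorting and deduplication to repo names.

```shell
#!/bin/bash
set -eu

# Hypothetical helper: render the canonical form of the repos field.
#   canonical_repos all
#   canonical_repos owner <owner>
#   canonical_repos list <repo> [<repo> ...]
canonical_repos() {
  local mode="$1"
  shift
  case "$mode" in
    all)
      printf 'all'
      ;;
    owner)
      printf 'owner:%s' "$1"
      ;;
    list)
      # Sort bytewise and dedupe so input order does not matter, then join
      # entries with a literal backslash-n (assumption, see lead-in).
      printf '%s\n' "$@" | LC_ALL=C sort -u \
        | awk 'NR>1 {printf "\\n"} {printf "%s", $0}'
      ;;
    *)
      echo "unknown repos mode: $mode" >&2
      return 1
      ;;
  esac
}
```

For example, canonical_repos list github/gh-aw-mcpg github/gh-aw and canonical_repos list github/gh-aw github/gh-aw-mcpg give the same string, which is what lets reordered policies share a cache.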

Examples of full canonical forms:

# Simple repo-only policy
blocked-users:
min-integrity:none
repos:github/gh-aw
trusted-bots:
trusted-users:

# Policy with exceptions
blocked-users:attacker1\nspammer2
min-integrity:unapproved
repos:owner:github
trusted-bots:dependabot[bot]\nrenovate[bot]
trusted-users:alice\nbob

Sorting + dedup ensures list order doesn't matter. All fields are always present (empty if unset) so the hash is stable.
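To make the canonicalization concrete, here is a shell sketch of the hash computation. The real computation would live in the compiler (Go), so this is purely illustrative. Assumptions: the helper names canon_list, canonical_policy, and policy_hash are hypothetical; list fields are passed as comma-separated strings; the repos argument is already in canonical scope form; entries are joined with a literal \n as in the examples above; and sha256sum (GNU coreutils) is available.

```shell
#!/bin/bash
set -eu

# Normalize a comma-separated list: strip spaces, lowercase, sort, dedupe,
# then join with a literal backslash-n. Empty input yields empty output.
canon_list() {
  printf '%s' "${1:-}" | tr -d ' ' | tr ',' '\n' | tr '[:upper:]' '[:lower:]' \
    | sed '/^$/d' | LC_ALL=C sort -u \
    | awk 'NR>1 {printf "\\n"} {printf "%s", $0}'
}

# Args: blocked-users, min-integrity, repos (canonical form),
#       trusted-bots, trusted-users.
# Every field is always emitted, in fixed alphabetical order.
canonical_policy() {
  printf 'blocked-users:%s\n' "$(canon_list "$1")"
  printf 'min-integrity:%s\n' "$2"
  printf 'repos:%s\n' "$3"
  printf 'trusted-bots:%s\n' "$(canon_list "$4")"
  printf 'trusted-users:%s\n' "$(canon_list "$5")"
}

# First 8 hex characters of the SHA-256 of the canonical policy.
policy_hash() {
  canonical_policy "$@" | sha256sum | cut -c1-8
}
```

Calling policy_hash "b,a" none github/gh-aw "" "" and policy_hash "a,b" none github/gh-aw "" "" yields the same value, while changing any field (for example min-integrity) yields a different canonical string and therefore a different hash.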

Why hash the full policy, not just repos:

  • Adding a blocked-users entry changes what data is accessible → cache from the unblocked era may contain poisoned data
  • Adding a trusted-users entry elevates certain users' integrity → old cache didn't distinguish their contributions
  • Changing trusted-bots alters which bot-authored content is trusted at writer level
  • Changing min-integrity changes the baseline trust level for all data

Why compute at compile time: The policy is static per workflow definition — it doesn't change between runs. If someone changes any policy field in the .md file and recompiles, the new lock file naturally gets a new hash → new cache → clean start. No runtime shell execution needed.

What policy isolation prevents:

  • Scope widening/narrowing: Different repos → cache miss → no cross-scope contamination
  • Trust escalation: Adding trusted-users → cache miss → old cache (without trust distinctions) is discarded
  • Unblocking: Removing a blocked-users entry → cache miss → old cache (which may contain blocked user's data marked as filtered) is discarded
  • Policy reordering: ["b","a"] and ["a","b"] in any list field → same hash → share cache correctly

Workflows without allow-only policy: Use a fixed sentinel value (e.g., nopolicy) as the policy hash. These workflows have no integrity enforcement, so policy isolation is moot — but the key format remains consistent.
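The key layout from Step 1, including the sentinel fallback, might be assembled like this. A sketch only: cache_key and cache_restore_prefix are hypothetical names, and nopolicy is the example sentinel suggested above rather than a settled value.

```shell
#!/bin/bash
set -eu

# Full key: memory-{integrityLevel}-{policyHash}-{workflowID}-{runID}
# An empty policy hash falls back to the "nopolicy" sentinel.
cache_key() {
  local integrity="$1"
  local policy_hash="${2:-nopolicy}"
  local workflow_id="$3"
  local run_id="$4"
  printf 'memory-%s-%s-%s-%s\n' "$integrity" "$policy_hash" "$workflow_id" "$run_id"
}

# Prefix used for restore-keys: the same key without the run ID.
cache_restore_prefix() {
  local integrity="$1"
  local policy_hash="${2:-nopolicy}"
  local workflow_id="$3"
  printf 'memory-%s-%s-%s-\n' "$integrity" "$policy_hash" "$workflow_id"
}
```

Because the integrity level and policy hash come before the workflow ID, a restore prefix for one level or policy can never match a key written under another.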

Example generated cache steps

Restore keys (for a run at unapproved with policy hash 7e4d9f12):

- uses: actions/cache/restore@v5
  with:
    key: memory-unapproved-7e4d9f12-${{ env.GH_AW_WORKFLOW_ID_SANITIZED }}-${{ github.run_id }}
    restore-keys: |
      memory-unapproved-7e4d9f12-${{ env.GH_AW_WORKFLOW_ID_SANITIZED }}-
    path: /tmp/gh-aw/cache-memory

Save key:

- uses: actions/cache/save@v5
  with:
    key: memory-unapproved-7e4d9f12-${{ env.GH_AW_WORKFLOW_ID_SANITIZED }}-${{ github.run_id }}
    path: /tmp/gh-aw/cache-memory

Step 2: Generate pre-agent git setup script

The compiler should generate a shell script (e.g., setup_cache_memory_git.sh) with the following logic:

#!/bin/bash
set -euo pipefail

CACHE_DIR="/tmp/gh-aw/cache-memory"
INTEGRITY="${GH_AW_MIN_INTEGRITY:-none}"

# All integrity levels in descending order (highest first)
LEVELS=("merged" "approved" "unapproved" "none")

cd "$CACHE_DIR"

# --- Format detection & migration ---
if [ ! -d .git ]; then
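  git init -b merged
  # Commits also need a committer identity; --author alone does not set one.
  # Stored in .git/config, so later merges and commits reuse it.
  git config user.name "gh-aw"
  git config user.email "gh-aw@github.com"
  git add -A
  git commit --allow-empty -m "initial" --author="gh-aw <gh-aw@github.com>"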

  # Create all integrity branches from the same baseline
  for level in "${LEVELS[@]}"; do
    git branch "$level" 2>/dev/null || true  # merged already exists as default
  done
fi

# --- Checkout current integrity branch ---
git checkout "$INTEGRITY"

# --- Merge down from higher-integrity branches ---
for level in "${LEVELS[@]}"; do
  if [ "$level" = "$INTEGRITY" ]; then
    break
  fi
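  # Merge higher-integrity branch; -X theirs means higher integrity wins conflicts.
  # If the merge still fails (e.g., a modify/delete conflict), abort it so the
  # working tree is not left half-merged for the agent.
  git merge "$level" -X theirs --no-edit -m "merge-from-$level" 2>/dev/null \
    || git merge --abort 2>/dev/null || true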
done

Step 3: Generate post-agent git commit script

#!/bin/bash
set -euo pipefail

CACHE_DIR="/tmp/gh-aw/cache-memory"
RUN_ID="${GITHUB_RUN_ID:-unknown}"

cd "$CACHE_DIR"

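# Stage all changes and commit on the current integrity branch.
# --allow-empty records a commit even when the agent changed nothing,
# so every run_id shows up in git log.
git add -A
git commit --allow-empty -m "run-${RUN_ID}" \
  --author="gh-aw <gh-aw@github.com>" 2>/dev/null || true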

# Control repo size
git gc --auto 2>/dev/null || true

Step 4: Hide .git/ from the agent

Add a tmpfs mount to the AWF launch command to prevent the agent from accessing or manipulating git metadata:

# In the AWF invocation, add:
--mount type=tmpfs,destination=/tmp/gh-aw/cache-memory/.git

This ensures:

  • The agent sees an empty directory at .git/ — it cannot read branches, switch branches, or forge commits
  • The real .git/ is intact on the host filesystem underneath the tmpfs overlay
  • The tmpfs is ephemeral — it disappears when the container exits
  • The agent cannot replace .git/ with a symlink or directory (mount point already exists)
  • The post-agent step (running on the host, outside the container) sees the real .git/

Alternative (if AWF doesn't support per-path tmpfs mounts): Use GIT_DIR separation. Store the repository at /tmp/gh-aw/cache-meta/.git and set both GIT_DIR=/tmp/gh-aw/cache-meta/.git and GIT_WORK_TREE=/tmp/gh-aw/cache-memory in the pre/post-agent scripts. Mount only the working tree into the container. Both directories must be listed in the cache paths so they are saved and restored together.
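A rough sketch of the GIT_DIR separation alternative, in case it helps evaluate it. The wrapper names cache_git and cache_git_init are hypothetical, and the identity passed via -c is an assumption. The two directory defaults are the paths suggested above.

```shell
#!/bin/bash
set -eu

# Defaults follow the paths suggested in the issue; override for local testing.
META_DIR="${META_DIR:-/tmp/gh-aw/cache-meta}"
WORK_DIR="${WORK_DIR:-/tmp/gh-aw/cache-memory}"

# Run git against the out-of-tree repository. The working tree never
# contains a .git/ entry, so nothing has to be hidden from the agent.
cache_git() {
  (
    cd "$WORK_DIR"
    GIT_DIR="$META_DIR/.git" GIT_WORK_TREE="$WORK_DIR" \
      git -c user.name="gh-aw" -c user.email="gh-aw@github.com" "$@"
  )
}

# Create both directories and initialize the repository if it is missing.
cache_git_init() {
  mkdir -p "$META_DIR" "$WORK_DIR"
  if [ ! -d "$META_DIR/.git" ]; then
    cache_git init -q
  fi
}
```

The pre/post-agent scripts would then call cache_git checkout, cache_git merge, cache_git add -A, and cache_git commit in place of bare git commands.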

Step 5: Update compiler's min-integrity and scope awareness

The compiler already knows the workflow's min-integrity and allow-only policy from the frontmatter:

github:
  min-integrity: unapproved
  allow-only:
    repos: ["github/gh-aw"]

Use these values to:

  1. Compute the policy hash at compile time (Step 1)
  2. Set the cache key prefix with integrity level and policy hash (Step 1)
  3. Bake GH_AW_MIN_INTEGRITY and GH_AW_CACHE_POLICY_HASH into the lock file as environment variables
  4. Generate the pre/post-agent scripts (Steps 2–3)
  5. Add the tmpfs mount flag to the AWF launch command (Step 4)

Step 6: Legacy migration (backward compatibility)

The pre-agent script (Step 2) handles migration automatically:

  • No .git/: Legacy tarball → git init + import + create all integrity branches from the same baseline
  • .git/ exists: Already migrated → normal branch checkout + merge

Reverting to an older compiler version is safe: the compiler just ignores .git/ inside the tarball. The agent still sees the same files. The .git/ directory is inert overhead (~40KB for small caches) until a git-aware pre-agent step uses it.

Scope migration: Legacy caches use the old key format (memory-{workflowID}-...) which won't match the new format (memory-{integrity}-{policyHash}-{workflowID}-...). The first run after upgrade gets a cache miss and starts fresh. This is the correct behavior — you can't retroactively assign integrity/policy provenance to legacy data.

Merge semantics reference

| Current run | Sees data from | Does NOT see |
| --- | --- | --- |
| merged | merged only | approved, unapproved, none |
| approved | approved + merged | unapproved, none |
| unapproved | unapproved + approved + merged | none |
| none | all levels | (nothing) |

When two branches modify the same file, -X theirs during merge means the higher-integrity version wins. This is correct: if a merged run wrote config.json and an unapproved run also wrote config.json, the unapproved run's checkout should see the merged version (it merged from above).
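The visibility rule in the table reduces to "every level from merged down to the current one". A small sketch, with the hypothetical helper name visible_levels:

```shell
#!/bin/bash
set -eu

# Print the integrity levels whose data a run at the given level can see,
# highest first. Levels are listed in descending order of integrity.
visible_levels() {
  local target="$1"
  local seen=""
  local level
  for level in merged approved unapproved none; do
    seen="${seen:+$seen }$level"
    if [ "$level" = "$target" ]; then
      printf '%s\n' "$seen"
      return 0
    fi
  done
  echo "unknown integrity level: $target" >&2
  return 1
}
```

visible_levels unapproved prints "merged approved unapproved", which is exactly the set of branches the pre-agent script merges from, plus the current branch.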

Residual risks

  1. Agent can corrupt its own integrity level's data — by design; you can't trust a compromised none-integrity run's output at the none level. The protection is that this corruption doesn't flow upward.

  2. Cache size growth: .git/ stores history. Mitigate with git gc --auto, --depth=1 shallow history, or periodic cache eviction (already happens via 7-day actions/cache TTL).

  3. Concurrent runs at the same integrity level — two simultaneous unapproved runs commit to the same branch but save with different run_id keys. The last one to save wins for that prefix match. This is the same behavior as today's flat cache — last writer wins.

  4. First run after upgrade is a cache miss — the new key format doesn't match old keys. This is intentional: legacy data has no integrity provenance and should not be trusted in the new model.

Testing plan

  1. Unit tests (compiler):

    • Verify cache key format includes integrity prefix and policy hash
    • Verify policy hash is deterministic: same policy fields in different order → same hash
    • Verify policy hash includes all fields: repos, min-integrity, blocked-users, trusted-users, trusted-bots
    • Verify changing any single policy field produces a different hash
    • Verify canonical forms: "all", "owner:X", sorted repo list, sorted user lists
    • Verify workflows without allow-only use sentinel policy hash
    • Verify pre-agent script generation includes git setup + merge logic
    • Verify post-agent script generation includes git commit
    • Verify AWF launch command includes tmpfs mount for .git/
  2. Integration test (workflow):

    • Run workflow at min-integrity: merged → verify cache created with git repo and all four branches
    • Run workflow at min-integrity: unapproved → verify merge from merged and approved branches
    • Run workflow at min-integrity: merged again → verify no data leaked from unapproved branch
    • Verify legacy tarball (no .git/) auto-migrates on first run
    • Verify different repos scopes produce separate caches (cache miss)
    • Verify identical repos scopes (reordered) share the same cache (cache hit)
    • Verify adding a blocked-users entry forces cache miss
    • Verify adding a trusted-users entry forces cache miss
  3. Security test:

    • Verify agent cannot access .git/ contents (tmpfs hides it)
    • Verify agent cannot git checkout to a different branch
    • Verify agent cannot git init a new repo (mount point blocks it)
    • Verify agent-written data only appears on the correct integrity branch after post-agent commit
    • Verify policy change forces cache miss (no cross-policy data leakage)
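The branch-isolation checks in the integration tests can be rehearsed locally without Actions. The harness below is a sketch under assumptions: the helper names (g, init_cache_repo, enter_level, commit_run) are hypothetical, it runs against a throwaway directory rather than /tmp/gh-aw/cache-memory, and it requires git 2.28+ for init -b.

```shell
#!/bin/bash
set -eu

LEVELS="merged approved unapproved none"  # highest integrity first

# Run git inside the given repo with a fixed identity.
g() {
  local dir="$1"
  shift
  git -C "$dir" -c user.name="gh-aw" -c user.email="gh-aw@github.com" "$@"
}

# Create the repo with one branch per integrity level, all at the same baseline.
init_cache_repo() {
  local dir="$1"
  local level
  mkdir -p "$dir"
  g "$dir" init -q -b merged
  g "$dir" commit -q --allow-empty -m "initial"
  for level in $LEVELS; do
    g "$dir" branch "$level" 2>/dev/null || true  # merged already exists
  done
}

# Pre-agent: check out the level's branch, then merge down from higher levels.
enter_level() {
  local dir="$1"
  local target="$2"
  local level
  g "$dir" checkout -q "$target"
  for level in $LEVELS; do
    if [ "$level" = "$target" ]; then
      break
    fi
    g "$dir" merge -q "$level" -X theirs --no-edit -m "merge-from-$level" \
      || g "$dir" merge --abort
  done
}

# Post-agent: commit whatever the "agent" wrote on the current branch.
commit_run() {
  local dir="$1"
  local run_id="$2"
  g "$dir" add -A
  g "$dir" commit -q --allow-empty -m "run-$run_id"
}
```

A typical scenario: init_cache_repo a temp directory, enter_level none, write plan.json, commit_run, then enter_level merged and confirm plan.json is absent. Writing a file at merged and re-entering none should show both files.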
