Skip to content

feat(audit): close audit-blindness gap for local .apm/ content (#887)#889

Merged
danielmeppiel merged 11 commits intomainfrom
feat/audit-includes-887
Apr 24, 2026
Merged

feat(audit): close audit-blindness gap for local .apm/ content (#887)#889
danielmeppiel merged 11 commits intomainfrom
feat/audit-includes-887

Conversation

@danielmeppiel
Copy link
Copy Markdown
Collaborator

What & Why

CISOs reported an audit-blindness gap: apm install was deploying local .apm/ content into governed directories (.github/, .claude/, etc.), but apm audit --ci never hash-verified those deployed files. The lockfile recorded local_deployed_files and their hashes, yet the audit pipeline only walked package dependencies — so post-install tampering of any locally-sourced file went undetected.

This PR closes the gap with end-to-end hash verification of every managed file plus a new includes: manifest field that gives users (and enterprises) explicit governance consent over what local content gets deployed.

Closes #887.

Solution overview

  • Self-entry synthesis at the lockfile read boundary — local-content state is projected as a virtual <self> dependency keyed by _SELF_KEY = ".". The on-disk YAML schema is unchanged; to_yaml() does a round-trip pop+restore so the synthetic entry never persists.
  • New includes: manifest field — three modes: auto (deploy all .apm/), an explicit list, or omitted (deploy nothing). Purely additive; existing manifests keep working.
  • Hash verification in the content-integrity audit check — now covers the self-entry alongside package deps, detecting post-install tampering on any managed file.
  • New baseline check includes-consent — advisory only, never blocks.
  • New policy manifest.require_explicit_includes (block / warn / off) — enterprises can mandate explicit include lists.
  • apm init -y scaffolds includes: auto — clean first-run UX.
  • Bundle stripping — local-content fields are removed before bundle publish; no leakage into the marketplace.

Critical bug fixes discovered during dogfood

These were caught while dogfooding the new audit on the APM repo itself and are essential to the fix:

  • cleanup.py — orphan loop was deleting all 26 deployed files post-install because the synthesized self-entry was misclassified as an orphan package. Fixed by treating _SELF_KEY as protected.
  • lockfile.pyis_semantically_equivalent skipped local_deployed_file_hashes comparison, so hash-only updates never persisted. Fixed by including the hashes in the equivalence check.
  • lockfile_enrichment.py (bundle path) — local-content fields needed stripping after the YAML round-trip. Fixed.

Audit coverage: before / after

flowchart LR
  subgraph BEFORE["BEFORE (audit blind to local .apm/)"]
    I1[apm install] --> S1[.apm/ source files]
    S1 --> D1[.github/ deployed]
    I1 --> L1[apm.lock.yaml<br/>local_deployed_files + hashes<br/>written but never read]
    A1[apm audit --ci] --> L1
    A1 -.never reads.-> D1
    D1 -.HASH DRIFT INVISIBLE.-> X1((CISO blind))
  end
Loading
flowchart LR
  subgraph AFTER["AFTER (every managed file verified)"]
    I2[apm install] --> S2[.apm/ source files]
    S2 --> D2[.github/ deployed]
    I2 --> L2[apm.lock.yaml<br/>local_deployed_files + hashes]
    L2 --> SE[from_yaml synthesizes<br/>self-entry under &lt;self&gt;]
    A2[apm audit --ci] --> L2
    SE --> HV[content-integrity:<br/>hash-verify every<br/>deployed file]
    HV -. detects .-> D2
    HV --> OK([7/7 green])
    HV --> DRIFT([hash-drift: file<br/>dep=&lt;self&gt;])
  end
Loading

Verification

  • Unit suite: 5383/5384 pass. The sole failure is the pre-existing flaky test_user_scope_skips_workspace_runtimes, unrelated to this change.
  • Live dogfood end-to-end:
    • apm install clean, apm audit --ci returns 7/7 green
    • Introduced drift on a deployed file -> audit detects with dep=<self>
    • Restored file -> audit returns to 7/7 green
  • apm init -y confirmed to scaffold includes: auto.
  • Pre-existing dogfood drift on auth-expert.agent.md and cicd.instructions.md remains visible — these are real post-install hand-edits from prior PRs and are handled outside this PR.

Panel reviews

All three specialist panels reviewed; CEO signed off SHIP.

  • Supply chain security — PASS-WITH-NOTES. Path-validation guard added to the hash-verify loop.
  • CLI logging UX — PASS-WITH-NOTES. Status-symbol contradiction fixed, remediation made conditional, hashes truncated for terminal width.
  • DevX UX — PASS-WITH-NOTES. init scaffold, doc check counts updated (6+16 -> 7+17), error wording branched, dep=<self> label introduced, cli-commands.md updated.
  • APM CEO — SHIP. "Every managed file, verified" — the audit-coverage milestone.

Breaking changes

None. Purely additive. No migration required for any user class — existing manifests, lockfiles, and CI flows continue to work unchanged.

Version

Bumps to 0.10.0 (minor: significant new functionality, no breakage).

Test plan

uv run pytest tests/unit tests/test_console.py
uv run pytest tests/integration/test_local_content_audit.py   # new file: 5 subprocess tests A-E

danielmeppiel and others added 4 commits April 24, 2026 01:56
Implement self-entry virtual lockfile mechanism + includes manifest field
+ hash drift detection + policy gate, addressing the audit-blindness
crisis identified by the CISO + crisis panel.

## Wave 1 (foundation)

- Lockfile self-entry: synthesize LockedDependency for local content
  (repo_url=<self>, source=local, local_path=., is_dev=true) at
  read boundary; pop+restore in to_yaml() with try/finally for
  byte-stable round-trip.
- New get_package_dependencies() excludes self-entry for callers
  doing dependency-walk operations.
- includes manifest field (None | 'auto' | List[str]) with parser.
- Policy schema: require_explicit_includes (bool, default false).
- Packer guard: skip source==local deps in apm bundles.

## Wave 2 (audit + policy + caller migrations)

- _check_lockfile_exists: probes local_deployed_files so local-only
  repos don't fail.
- _check_no_orphans: filters _SELF_KEY.
- _check_content_integrity: re-reads files, compares SHA-256 against
  deployed_file_hashes; skips missing/symlinks/unhashed entries.
- New _check_includes_consent: advisory [!] when includes undeclared
  but local content present (always passes).
- New _check_includes_explicit in policy seam, threaded via
  conditional kwarg from policy_gate.
- Migrated 6 callers from get_all_dependencies() to
  get_package_dependencies() to hide self-entry from
  walk-and-act operations.

47 new tests added across 5 files. 666 targeted tests pass.

Refs: #887

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…-trip (#887)

Wave 3: verification tests + N1 fix.

## Scanner coverage (compute_deployed_hashes)

9 tests pinning the files-vs-hashes contract:
- Every regular file has a corresponding hash entry.
- Directories excluded; symlinks excluded (deterministic).
- Empty files hashed (well-known SHA-256 of b'').
- Hidden files (.mcp.json) hashed.
- The set difference 'local_deployed_files - hashes.keys()' equals
  exactly the directory entries.

Live confirmation on this repo: 26 files, 18 hashes, 8 directory diff
(under .github/skills/) -- no regular file is missing a hash.

## Packer regression + N1 fix

Architect's N1 finding: enrich_lockfile_for_pack() was serializing
'local_deployed_files' / 'local_deployed_file_hashes' verbatim into
bundle lockfiles, causing a phantom self-entry on unpack whose files
the unpacker would then fail to verify.

Fix at bundle/lockfile_enrichment.py: strip both fields from the bundle
lockfile dict after YAML round-trip and before pack: metadata
serialization. Original LockFile object untouched.

4 regression tests + live 'apm pack' confirms bundle lockfile is clean.

## Integration round-trip (5 tests, all pass live)

A. Install records self-entry (local_deployed_files + hashes).
B. Audit clean install: content-integrity passes, includes-consent
   emits [!] advisory.
C. Audit detects drift: tampered file -> exit 1 + 'hash-drift: <path>'.
D. Includes declared (auto or list): no [!] advisory.
E. Policy require_explicit_includes: blocks undeclared/auto.

End-to-end via subprocess (matches existing test_local_install.py
convention).

65 targeted tests pass.

Refs: #887

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
)

Wave 4 + critical fixes for the local-content audit work:

- apm.yml: declare 'includes: auto' to consent to local .apm/ deployment
  governance for this repo's own dogfood.
- pyproject.toml + uv.lock: bump to 0.10.0 (target release for #887).
- CHANGELOG.md: Added/Fixed/Changed entries under [Unreleased].
- Docs: lockfile-spec.md ($4.5 Self-Entry Convention),
  manifest-schema.md ($3.9 'includes' field), governance-guide.md
  (remove drift-gap caveats now that hash verification ships),
  apm-usage/governance.md skill.

Critical bug fixes discovered during dogfood:
- cleanup.py: orphan loop iterates lockfile.dependencies which now
  includes the synthesized self-entry under '.'. Without a guard,
  every install was deleting all 26 deployed local files because
  '.' is never in apm.yml's external dependencies. Add explicit
  skip on _SELF_KEY before the orphan check.
- lockfile.py: is_semantically_equivalent compared local_deployed_files
  but skipped local_deployed_file_hashes. Hash-only changes were
  treated as 'no change', so post-install hash refreshes never
  persisted, causing audit to report stale hash drift forever. Add
  the hash-dict comparison.

Verified locally: install clean, audit 7/7 green, drift introduced
detected, drift removed back to green.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Three-panel review (security, devx, logging) feedback applied:

Security:
- ci_checks.py:_check_content_integrity: gate every hash-verification
  read through BaseIntegrator.validate_deploy_path so a forged lockfile
  with a traversal path (e.g. '../../.env') cannot induce reads outside
  managed locations or leak hashes via audit output.

DevX UX:
- _helpers.py:_create_minimal_apm_yml: scaffold 'includes: auto' so
  'apm init' projects don't trip the includes-consent advisory the
  moment a primitive lands in .apm/. Matches what manifest-schema.md
  promises ('default for newly initialised projects').
- policy_checks.py:_check_includes_explicit: branch the error message
  -- 'add includes: [...]' when manifest has no includes field, and
  'replace includes: auto' only when the user actually wrote auto.
- ci_checks.py: render the synthesized self-entry as 'dep=<self>'
  rather than the internal '.' constant in hash-drift details.
- Docs: bump check counts (6 baseline -> 7, 16 policy -> 17) across
  governance-guide.md, apm-policy.md, ci-cd.md, ci-policy-setup.md.
  Add includes-consent to enumerated baseline list. Document hash
  drift detection in cli-commands.md ('What it detects').
- manifest-schema.md: correct 'rejected at parse time' to 'rejected
  at install/audit time by the explicit-includes policy check'.

Logging:
- ci_checks.py:_check_content_integrity: build remediation message
  conditionally so hash-only failures don't suggest 'apm audit --strip'
  (which only strips Unicode).
- ci_checks.py: truncate hashes in hash-drift detail line for terminal
  width (full hashes still in JSON output).
- ci_checks.py:_check_includes_consent: drop '[!]' prefix from
  message text -- the audit table renderer owns the status column;
  embedding [!] inside a passed=True row produced contradictory
  '[+]  ... [!] ...' output. Replace 'consent check N/A' with
  'includes consent check skipped' (no jargon).
- test_file_scanner.py: replace section-sign character with ASCII.

All 5383 unit tests pass (1 pre-existing flaky MCP scope test
unrelated). Live verification: install clean, audit 7/7 green,
drift detection works, 'apm init -y' scaffolds 'includes: auto'.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 24, 2026 00:34
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Closes the apm audit --ci integrity gap for locally-deployed .apm/ content by projecting local lockfile state into a virtual self-entry and extending baseline audit to hash-verify all managed files (including local ones), plus adding an includes: manifest consent signal with optional policy enforcement.

Changes:

  • Synthesize a virtual . / <self> lockfile dependency in-memory and update key call sites to skip it where appropriate.
  • Extend baseline content-integrity to include per-file SHA-256 drift detection; add includes-consent baseline advisory and manifest.require_explicit_includes policy check.
  • Add/adjust tests and docs to cover the new audit behavior, policy seam wiring, and bundling safeguards (strip local-content metadata; skip local-source deps in bundles).
Show a summary per file
File Description
uv.lock Bumps apm-cli version to 0.10.0.
pyproject.toml Bumps project version to 0.10.0.
apm.yml Bumps repo manifest version and adds includes: auto.
apm.lock.yaml Updates local deployed file hashes (dogfood drift baseline).
CHANGELOG.md Adds Unreleased entries for self-entry/includes/audit hash drift and related fixes.
docs/src/content/docs/reference/manifest-schema.md Documents new includes: manifest field and policy interaction.
docs/src/content/docs/reference/lockfile-spec.md Documents the in-memory self-entry convention and invariants.
docs/src/content/docs/reference/cli-commands.md Updates apm audit --ci baseline check enumeration and hash drift behavior.
docs/src/content/docs/integrations/ci-cd.md Updates baseline/policy check counts (7 + 17).
docs/src/content/docs/guides/ci-policy-setup.md Updates baseline/policy check counts and text references.
docs/src/content/docs/enterprise/governance-guide.md Updates governance matrix/counts and adds explicit-includes policy documentation.
docs/src/content/docs/enterprise/apm-policy.md Updates CI-time baseline check count in docs.
packages/apm-guide/.apm/skills/apm-usage/governance.md Extends governance skill docs with explicit-includes knob + includes: explanation.
src/apm_cli/deps/lockfile.py Adds _SELF_KEY, synthesizes self-entry in from_yaml(), excludes it in new get_package_dependencies(), updates equivalence and installed-paths guard.
src/apm_cli/policy/ci_checks.py Extends baseline checks: local-only lockfile relevance, orphan skip for self, hash drift verification, adds includes-consent advisory.
src/apm_cli/policy/policy_checks.py Adds explicit-includes policy check and a seam param to run it only when manifest context is provided.
src/apm_cli/policy/parser.py Parses/validates manifest.require_explicit_includes.
src/apm_cli/policy/schema.py Adds require_explicit_includes to policy schema.
src/apm_cli/install/phases/policy_gate.py Threads ctx.apm_package.includes into dependency policy checks when available.
src/apm_cli/install/phases/cleanup.py Protects the synthesized self-entry from orphan cleanup deletion.
src/apm_cli/bundle/packer.py Skips source=local dependencies when collecting deployed files for apm bundles.
src/apm_cli/bundle/lockfile_enrichment.py Strips local_deployed_files + local_deployed_file_hashes from bundle lockfile YAML.
src/apm_cli/models/apm_package.py Adds includes field parsing/validation to APMPackage.
src/apm_cli/integration/skill_integrator.py Uses get_package_dependencies() to avoid <self> in ownership maps.
src/apm_cli/integration/mcp_integrator.py Uses get_package_dependencies() to avoid <self> during transitive collection.
src/apm_cli/commands/_helpers.py apm init scaffolds includes: auto and skips self-entry in install-path expectations.
src/apm_cli/commands/uninstall/engine.py Uses get_package_dependencies() to avoid self-entry crashes in uninstall flows.
src/apm_cli/commands/deps/cli.py Uses get_package_dependencies() so self-entry does not show up in apm deps tree.
tests/unit/test_lockfile_self_entry.py Adds unit coverage for self-entry synthesis + serialization invariants.
tests/unit/test_self_entry_caller_guards.py Regression tests for representative callers skipping the self-entry.
tests/unit/test_packer.py Adds tests for excluding local-source deps in bundles and stripping local-content fields.
tests/unit/test_deps_list_tree_info.py Adds guard test ensuring deps tree hides self-entry.
tests/unit/test_ci_checks.py Adds baseline audit coverage for hash drift + local-only repo behavior + includes-consent.
tests/unit/test_audit_policy_command.py Updates baseline check count expectations in CI/policy JSON tests.
tests/unit/test_apm_package.py Adds unit tests for includes parsing and behavior.
tests/unit/install/test_policy_gate_phase.py Tests that policy_gate threads manifest_includes correctly.
tests/unit/policy/test_policy_checks.py Updates expected policy check count to 17 and imports explicit-includes check.
tests/unit/policy/test_run_dependency_policy_checks.py Adds seam-level tests for explicit-includes check gating.
tests/unit/policy/test_parser.py Adds parser/validator tests for require_explicit_includes.
tests/unit/install/test_file_scanner.py Adds coverage tests for compute_deployed_hashes directory/symlink/empty-file behavior.
tests/integration/test_local_content_audit.py Adds end-to-end subprocess tests for local-content install + audit drift detection + policy enforcement.

Copilot's findings

Comments suppressed due to low confidence (1)

docs/src/content/docs/reference/manifest-schema.md:180

  • This section claims apm audit emits an [!] includes-consent advisory, but the current CI renderer only shows pass/fail status symbols and includes-consent returns passed=True with no [!] prefix in the message. Either update the implementation to include [!] in the includes-consent message (including JSON/SARIF), or adjust this doc text to describe the advisory without asserting a specific status symbol.
Declares which local `.apm/` content the project consents to publish when packing or deploying. Three forms are supported:

1. **Undeclared** -- field omitted. Legacy behaviour: all local `.apm/` content is published as if `auto` were set. `apm audit` emits an `[!]` includes-consent advisory whenever local content is deployed under this form.
2. **`includes: auto`** -- explicit consent to publish all local `.apm/` content via the file scanner. No path enumeration required. Default for newly initialised projects.
  • Files reviewed: 40/41 changed files
  • Comments generated: 5

Comment thread src/apm_cli/policy/ci_checks.py Outdated
Comment thread tests/integration/test_local_content_audit.py Outdated
Comment thread tests/unit/test_audit_policy_command.py Outdated
Comment thread docs/src/content/docs/reference/manifest-schema.md
Comment thread CHANGELOG.md Outdated
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Agent-Logs-Url: https://github.com/microsoft/apm/sessions/fb9d2cd4-ba19-4ecb-b6cd-743cd5743bdd

Co-authored-by: danielmeppiel <51440732+danielmeppiel@users.noreply.github.com>
@danielmeppiel
Copy link
Copy Markdown
Collaborator Author

Note on follow-up

This PR is the foundation that Epic #898 builds on. The Epic-PR will rebase on top of the work landed here and extend it with:

No scope change requested here; this PR ships as designed and closes #684 + #887. The Epic owns the next layer.

@danielmeppiel danielmeppiel added panel-review Trigger the apm-review-panel gh-aw workflow and removed panel-review Trigger the apm-review-panel gh-aw workflow labels Apr 24, 2026
@danielmeppiel
Copy link
Copy Markdown
Collaborator Author

Found issue: shared/apm.md restore step crashes on PRs that modify apm.lock.yaml

Tracking issue: #901

What's failing here

The pr-review-panel agent run on this PR aborts before the agent even starts: https://github.com/microsoft/apm/actions/runs/24883083247/job/72856352586?pr=889

error: Your local changes to the following files would be overwritten by checkout:
        apm.lock.yaml
ERR_API: Failed to checkout PR branch: The process '/usr/bin/git' failed with exit code 1

This isn't specific to PR #889 - it will happen on any PR that touches apm.lock.yaml.

What's actually going on (plain English)

shared/apm.md (our recommended gh-aw integration template) runs in two phases on a PR:

  1. Trusted base phase: checks out main, downloads the APM bundle, calls microsoft/apm-action in restore mode to deploy primitives into the workspace. Restore mode also writes apm.lock.yaml back into the workspace as a side-effect (even though apm unpack documents the lockfile as metadata-only).
  2. PR checkout phase: gh-aw runs git checkout -B <branch> origin/pr-head. Git refuses because the workspace is dirty.

When the PR also modifies apm.lock.yaml (which is exactly what every dependency-related PR does), the two writes collide and checkout aborts.

Why this matters beyond PR #889

  • It triggers on the canonical first-run PR for new APM adopters: "I added a dependency, please review." That's the worst possible adopter to break.
  • The error names neither APM nor microsoft/apm-action, so consumers debug gh-aw or assume their checkout is broken.
  • shared/apm.md ships from microsoft/apm but executes in every external repo that imports it. It's effectively a public API surface for APM's CI integration - regressions have cross-repo blast radius.
  • The crash silently masks a separate higher-severity finding (PR-controlled primitive shadowing) that we'll file against gh-aw once this is fixed - see apm unpack writes apm.lock.yaml / apm.yml to output dir, violating documented metadata-only contract #901's "Out of scope follow-ups".

Chosen fix

Append one step to shared/apm.md after "Restore APM packages":

- name: Reset tracked files after APM restore
  run: git checkout -- . 2>/dev/null || true

Then gh aw compile and commit the regenerated .lock.yml files.

Why this approach

  • Untracked primitives survive: bundle-deployed files in .github/{skills,agents,instructions,prompts}/ are untracked in consumer repos, so they survive both git checkout -- . and the subsequent PR checkout. Agent discovery keeps working unchanged.
  • Idempotent and fail-safe: no-op when clean, || true swallows non-zero in non-git contexts, order-insensitive.
  • Whole-tree, not file-list: hard-coding apm.lock.yaml apm.yml would silently regress when future bundle paths overlap tracked files (e.g., .github/agents/*.agent.md). Safe here because no prior gh-aw pre-step writes tracked workspace files.
  • Zero new inputs, zero new env vars for adopters; mental model unchanged; no auth surface.

Alternatives considered:

  • Drop the lockfile write inside microsoft/apm-action - the right durable fix, but a breaking change to the action's contract. Held for the action's next major.
  • Relocate restore to /tmp/... and copy primitives in after PR checkout - breaks agent primitive discovery, no gh-aw post-checkout pre-step hook, credential-surface risk.

Full rationale, required actions before merge, verification steps, and out-of-scope follow-ups (apm-action vNext, parallel diagnostic warning PR, gh-aw primitive-shadowing escalation) are in #901.

@danielmeppiel
Copy link
Copy Markdown
Collaborator Author

Update on the fix layering: after re-examining the architecture, the bridge-step approach in gh-aw was the wrong layer. We control both apm unpack and microsoft/apm-action, so the fix belongs there. New plan:

  1. apm unpack writes apm.lock.yaml / apm.yml to output dir, violating documented metadata-only contract #901 (rewritten): apm unpack stops writing apm.lock.yaml / apm.yml to the output directory (aligns implementation with the CLI's own documented metadata-only contract). Root cause.
  2. v1.5: restore mode must install apm CLI so bundles are unpacked via 'apm unpack' (not raw tar fallback) apm-action#26: v2 of the action builds on the fixed CLI; working-directory stops getting dirtied on restore. Every action consumer (gh-aw users, raw GH Actions users, anything) gets the fix transparently with zero opt-in.
  3. shared/apm.md restore step crashes on PRs that modify tracked files (e.g. apm.lock.yaml) github/gh-aw#28256 (closed as redirect): shared/apm.md just bumps to microsoft/apm-action@v2 once available. One-line change. No template gymnastics.

The single-line git checkout -- . workaround in microsoft/apm's own copy of shared/apm.md stays in place until step 2 ships, then it becomes a documented no-op and we drop it.

@danielmeppiel
Copy link
Copy Markdown
Collaborator Author

Final diagnosis (corrected)

Verified -- the bug is isolated to microsoft/apm-action's restore mode. The apm unpack CLI is correct.

What's actually happening

  1. microsoft/apm-action restore mode (runner.ts:90) deliberately skips installing the apm CLI for speed.
  2. Therefore extractBundle (bundler.ts:62-89) always falls through to its tar xzf --strip-components=1 fallback.
  3. Raw tar dumps the entire bundle into working-directory -- including apm.lock.yaml and apm.yml, which apm unpack would have correctly excluded.
  4. When working-directory defaults to ${{ github.workspace }} (a git checkout), those tracked files become dirty.
  5. The next step (checkout_pr_branch.cjs) tries git checkout and aborts:
    error: Your local changes to the following files would be overwritten by checkout:
            apm.lock.yaml
    ERR_API: Failed to checkout PR branch
    
  6. This breaks every pull_request_target agentic workflow whose triggering PR modifies apm.lock.yaml -- the canonical first-run PR for any new APM adopter.

Single-layer fix

microsoft/apm-action#26 -- ship as v1.5: install the apm CLI in restore mode (drop the "skip install" optimization), take the verified apm unpack path. Tiny diff in one file (runner.ts), no API change, no breaking change. The CLI and gh-aw template need no edits.

Tactical mitigation while v1.5 ships

A small PR will add a post-restore git checkout -- . 2>/dev/null || true in microsoft/apm's own copy of shared/apm.md. It becomes a documented no-op once we bump to microsoft/apm-action@v1.5.

What changed vs. the prior comment

References:

danielmeppiel added a commit to microsoft/apm-action that referenced this pull request Apr 24, 2026
…rkspace (#27)

* fix(restore): install APM and unpack via apm CLI to avoid dirtying workspace

In restore mode the action previously skipped 'ensureApmInstalled()'
(runner.ts:90 — 'skip APM installation entirely'), forcing extractBundle
through its raw 'tar xzf --strip-components=1' fallback. That fallback
extracted the *entire* bundle — including 'apm.lock.yaml' and 'apm.yml'
— into 'working-directory'. When 'working-directory' was a git checkout
(the default '${{ github.workspace }}'), those tracked files became
dirty and any subsequent 'git checkout' aborted with:

  error: Your local changes to the following files would be overwritten
  by checkout: apm.lock.yaml

This broke every 'pull_request_target' agentic workflow whose triggering
PR modified the lockfile — i.e., the canonical first-run PR for any new
APM adopter (originally surfaced via gh-aw 'shared/apm.md' callers).

The 'apm unpack' CLI honors the bundle contract: it copies only files
listed in the lockfile's 'deployed_files' (primitives + apm_modules) and
never writes the lockfile or manifest to the output dir. The fix is to
always install APM in restore mode so extractBundle takes the verified
'apm unpack' path.

Tool-cached install adds at most a single small download per runner
(negligible vs. agent-job cost), and we get bundle integrity verification
for free. The previous 'unverified — install APM for integrity checks'
warning was already advertising the fallback as the inferior code path.

Defense-in-depth: the tar fallback (now reached only if 'apm install'
itself fails) also gets '--exclude=apm.lock.yaml --exclude=apm.lock
--exclude=apm.yml' so it can never dirty a workspace either.

Tests:
  - new unit test in runner.test.ts asserts ensureApmInstalled() runs
    before extractBundle() in restore mode
  - existing bundler tar-fallback test extended to assert exclude flags
  - new integration job 'test-restore-clean-workspace' in CI reproduces
    the gh-aw scenario end-to-end on a real ubuntu-latest runner: pack a
    bundle, commit a tracked apm.lock.yaml, restore the bundle, then
    assert (a) 'git status' shows no modifications to apm.lock.yaml /
    apm.yml and (b) 'git checkout <baseline-sha> -- .' succeeds without
    the regression error

Refs: #26
Refs: microsoft/apm#889

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* review: drop v1.5 reference + sharpen unverified-restore log message

Addresses two low-confidence Copilot review nits on #27:
- runner.ts:94 — drop the 'added in v1.5' marker from the rationale
  comment so it doesn't go stale on backports/cherry-picks; the
  rationale stands on its own.
- runner.ts:116 — restore mode now installs APM up-front, so the
  unverified branch only runs if that install failed transiently and
  extractBundle fell through to tar. Point operators at the install
  logs instead of telling them to install APM.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@danielmeppiel
Copy link
Copy Markdown
Collaborator Author

Tracking the follow-up cleanup in #902 — bump shared/apm.md to microsoft/apm-action@v1.4.2 now that the upstream fix has shipped (release: https://github.com/microsoft/apm-action/releases/tag/v1.4.2).

danielmeppiel added a commit that referenced this pull request Apr 24, 2026
v1.4.2 fixes the restore-mode workspace pollution that caused #901 and
the CI failure surfaced in #889. With v1.4.2, restore mode installs APM
and uses 'apm unpack' to extract bundles, writing only files declared in
the lockfile's deployed_files instead of overwriting tracked
apm.lock.yaml / apm.yml / apm_modules in the caller's git workspace.

- shared/apm.md: bump both pack + restore steps to @v1.4.2
- pr-review-panel.lock.yml: regenerated via 'gh aw compile' (SHA pin updated)
- actions-lock.json: SHA pin updated

Closes #902.
Upstream: microsoft/apm-action#27, release v1.4.2.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@danielmeppiel danielmeppiel added panel-review Trigger the apm-review-panel gh-aw workflow and removed panel-review Trigger the apm-review-panel gh-aw workflow labels Apr 24, 2026
@microsoft microsoft deleted a comment from github-actions Bot Apr 24, 2026
@danielmeppiel danielmeppiel removed the panel-review Trigger the apm-review-panel gh-aw workflow label Apr 24, 2026
@danielmeppiel danielmeppiel added panel-review Trigger the apm-review-panel gh-aw workflow and removed panel-review Trigger the apm-review-panel gh-aw workflow labels Apr 24, 2026
@github-actions
Copy link
Copy Markdown

🌱 OSS Growth Hacker — Findings


1 · Feature–adoption fit: strong enterprise unlock, careful on community friction

Enterprise signal is excellent. The includes:require_explicit_includesapm audit --ci chain gives CISOs an auditable answer to "what exactly are we shipping?" — which is the question that blocks enterprise adoption of any agent toolchain. This is a top-three enterprise objection remover alongside dependency-allow/deny and MCP transport governance. Well done.

Community friction is managed, but with a gap. The includes-consent advisory is advisory-only — smart for adoption (existing users aren't broken). But the advisory message needs to land as helpful rather than nagging. Recommendation: ensure the audit output includes the one-liner fix (add includes: auto to apm.yml) directly in the advisory text, so the user can act in < 5 seconds without leaving the terminal.


2 · Scaffolding: includes: auto in apm init is the right call

The fact that _helpers.py:489 scaffolds includes: auto by default means every new project is born clean — no advisory surprise on first apm audit. This is correct for the first-5-minutes funnel. The inline comment at L484–488 is a model for developer empathy in scaffolding.

Gap: the hello-world template does NOT include includes: auto. This means users who apm init --template hello-world will get a project that immediately triggers the includes-consent advisory if they add any .apm/ content. The template should be updated to include includes: auto so the happy path is consistent whether the user runs bare apm init or uses a template.

Side-channel → CEO: The template gap is minor but worth a fast follow. If we ship the advisory without fixing templates, the first thing a new user sees from apm audit could be a warning they didn't cause — that's a retention risk. Recommend adding includes: auto to all templates in this PR or a fast follow before release.


3 · CHANGELOG narrative: solid substance, needs story shaping for release notes

The CHANGELOG entries are accurate and complete — 6 bullets under Added, 2 under Changed, 2 under Fixed. But they're written for maintainers, not users. For the release announcement, the story arc should be:

Headline: "APM now audits your local agent content — not just your dependencies."

One-liner: Every .apm/ file you author is now hash-verified, drift-detected, and policy-governed — closing the last blind spot in apm audit.

Enterprise hook: require_explicit_includes: true gives platform teams explicit, PR-reviewable control over which local files are published.

The current CHANGELOG buries the "audit blindness" fix under "Fixed" when it's the lead feature for the release narrative. Recommend reordering or adding a summary paragraph at the top of Unreleased.


4 · Docs coverage: comprehensive but missing the quickstart bridge

What's done well:

  • manifest-schema.md §3.9: Three-form explanation with YAML examples is excellent. The progressive disclosure (undeclared → auto → explicit list) maps perfectly to the maturity curve.
  • governance-guide.md: The require_explicit_includes callout paragraph is clear and well-placed. The 6→7 baseline check updates are consistent across all surfaces (ci-policy-setup, ci-cd, governance-guide, cli-commands).
  • lockfile-spec.md §4.5: Self-entry convention is thorough. is_dev: true as non-negotiable with the bundle-export reasoning is exactly the kind of design doc that builds trust with contributors.
  • governance.md skill: Clean addition that agent-context consumers can reference.

What's missing — quickstart and first-package funnel:
Neither quick-start.md nor first-package.md mention includes: at all. These are the two highest-traffic pages for new users. When a user follows the quickstart, authors their first .apm/ skill, and then runs apm audit, they'll see the includes-consent advisory with no context for what it means.

Recommendation: add a single sentence + code snippet to the quickstart at the point where users first create .apm/ content:

> 💡 Your `apm.yml` already declares `includes: auto`, which tells APM
> to publish all your local `.apm/` content. For stricter control,
> replace `auto` with an explicit path list — see [Manifest Schema §3.9].

Side-channel → CEO: The quickstart bridge is the highest-leverage doc change in this PR. One sentence in the right place prevents a "what is this advisory?" moment that would otherwise become a GitHub issue. Recommend this is added before merge, not as a follow-up.


5 · Competitive positioning: this is a differentiator — exploit it

Side-channel → CEO: No competing agent package manager (that I'm aware of) offers granular local-content governance with advisory → enforcement progression and policy-driven org control. This is the enterprise pitch:

"Other tools install agent context. APM governs it."

The includes: field is the visible proof of this claim. It's the only manifest field in any agent PM that answers "which of your own files are you shipping?" — and it does so with a three-tier maturity model (undeclared → auto → explicit) that matches how orgs actually adopt governance (start permissive, tighten later).

Launch beat recommendation: This deserves its own blog post or thread, not just a release note. Angle: "Why your agent's local files are a supply chain surface — and how APM closes the gap." The drift-detection + hash-integrity story is viscerally compelling for security-minded audiences.


6 · Advisory approach: correct for adoption, watch for advisory fatigue

The non-blocking includes-consent advisory is the right call for the funnel:

  • ✅ Doesn't break existing users (no gate failure on undeclared)
  • ✅ Teaches the concept before enforcing it
  • ✅ Enterprise teams can enforce via policy when ready

Risk: If the advisory appears on every audit run for legacy projects that don't set includes:, it becomes noise that users learn to ignore. Ensure the advisory is:

  1. Shown once prominently, not repeated per-file
  2. Suppressible via includes: auto (which it is — good)
  3. Not counted as a "finding" in the exit summary (so 0 findings stays clean for projects that just haven't opted in yet)

7 · require_explicit_includes: enterprise adoption lever, correctly positioned

This policy field is correctly placed as audit-only (not install-time blocking by default). The progression is:

  1. Dev adds .apm/ content → audit shows advisory → dev adds includes: auto → advisory gone
  2. Platform team enables require_explicit_includes: trueauto rejected → dev enumerates paths → PR-reviewable

This mirrors exactly how npm files field and .npmignore evolved — start permissive, tighten with organizational maturity. The enterprise pitch writes itself.


8 · Minor observations

  • Governance-guide.md removes the "no drift detection" limitation from both the known-gaps section (L490) and the "not guaranteed" section (L315). This is a confidence-building moment worth calling out in release notes: "We closed the drift gap."
  • The bypass contract table is updated consistently (6→7 baseline, 16→17 policy). The thoroughness here builds enterprise trust — these tables get screenshot'd into security review decks.
  • apm-policy.md change is minimal (6→7 count update). Consider adding a one-liner about includes-consent as a baseline check in the "CI time" section for discoverability.

Summary: This PR is a strong enterprise adoption lever and a well-executed security improvement. The two actionable items before merge are: (1) bridge the quickstart/first-package docs so new users understand includes: before they hit the advisory, and (2) add includes: auto to the hello-world template for funnel consistency. Everything else is release-narrative polish that can happen in parallel.

Generated by PR Review Panel for issue #889 · ● 15.1M ·

@github-actions
Copy link
Copy Markdown

🔐 Supply-Chain Security Review — PR #889

Finding 1 · includes list accepts unsanitized paths (Medium — pre-wire for future work)

File: src/apm_cli/models/apm_package.py:200-213

The includes field parser validates type (string "auto" or list[str]) but does not validate path content against traversal sequences. A malicious or careless apm.yml can declare:

includes:
  - "../../.env"
  - "../../secrets/tokens.json"

These strings are stored verbatim on APMPackage.includes and currently flow only to advisory/policy checks (ci_checks._check_includes_consent, policy_checks._check_includes_explicit), which check presence of the field, not content of the paths. No file I/O is performed today, so this is not an exploitable vulnerability right now.

However, the stated purpose of includes is to govern which local files get deployed. The moment any future code path resolves these strings against the filesystem (e.g., project_root / path), path traversal becomes exploitable. The security model requires path safety at the earliest trust boundary — parse time.

Recommendation: Add validate_path_segments (from utils/path_security.py) to each entry in the list branch at parse time:

from ..utils.path_security import validate_path_segments

elif isinstance(includes_value, list):
    if not all(isinstance(item, str) for item in includes_value):
        raise ValueError(...)
    for item in includes_value:
        validate_path_segments(item, context="includes path")
    includes = list(includes_value)

This costs nothing today and prevents a class of bugs when includes gains file-operation semantics.


Finding 2 · _SELF_KEY cross-module private import (Low — fragility)

Files: src/apm_cli/install/phases/cleanup.py:70, src/apm_cli/policy/ci_checks.py:19

Both modules import _SELF_KEY (a private-by-convention name) from lockfile.py. The underscore prefix signals "internal detail, may change without notice." If a future refactor renames or removes this constant, the orphan-cleanup guard at cleanup.py:78 would break:

if _orphan_key == _SELF_KEY:
    continue

A broken import raises ImportError (loud — good). But if the constant were renamed and the old symbol left as a stale export with a different value, the guard would silently stop working, causing orphan cleanup to delete the project's own .github/ files — a data-loss scenario that also breaks the security model's containment guarantee.

Recommendation: Either promote _SELF_KEY to a public SELF_KEY constant (drop the underscore, document the contract), or add a is_self_entry(dep_key) predicate to the LockFile API and have callers use that instead of comparing against an internal constant.


Finding 3 · get_all_dependencies() call-site audit — all safe ✅

I audited every call site that still uses get_all_dependencies() (as opposed to the new get_package_dependencies()):

Call site Self-entry handling Verdict
packer.py:128 Filters source == "local" immediately after ✅ Safe
unpacker.py:129 Bundle lockfile has local_deployed_files stripped → no self-entry synthesized ✅ Safe
plugin_exporter.py:470 Self-entry has is_dev=True → skipped by dev filter ✅ Safe
resolve.py:62 Verbose logging only ✅ Safe
lockfile.py:274 (to_yaml) Self-entry popped before serialization, restored in finally ✅ Safe
lockfile.py:418 (get_installed_paths) Explicit local_path == _SELF_KEY: continue guard ✅ Safe

All callers are accounted for. The packer.py and plugin_exporter.py sites use implicit guards (field-value filters) rather than the explicit _SELF_KEY sentinel — this is acceptable given the defense-in-depth, but see Finding 2 about making the sentinel a stable contract.


Finding 4 · Self-entry repo_url="<self>" cannot be fetched — but to_dependency_ref() is fragile (Low)

The synthesized self-entry at lockfile.py:316-324 uses repo_url="<self>". I traced all paths where to_dependency_ref() could be called on a LockedDependency:

  • get_installed_paths() (lockfile.py:421) — guarded: skips local_path == _SELF_KEY at line 419.
  • _dep_install_path() (plugin_exporter.py:399) — guarded: self-entry skipped by is_dev=True.
  • _helpers.py:137guarded: uses get_package_dependencies().

If any of these guards were bypassed and to_dependency_ref() were called on the self-entry, get_install_path() would attempt Path(".").name""validate_path_segments("", ..., reject_empty=True)PathTraversalError. This is fail-closed — good. No network fetch is possible because the path validation throws before any URL resolution.

No action needed — defense in depth is adequate here.


Finding 5 · Advisory vs. hard-fail escalation path is correctly separated ✅

Two distinct checks exist and serve complementary roles:

  • ci_checks._check_includes_consent (line 339): Always passed=True. Baseline advisory nudge. Cannot block CI on its own.
  • policy_checks._check_includes_explicit (line 575): Gated on policy.manifest.require_explicit_includes (schema default: False). When enabled, rejects both None and "auto" — only an explicit path list satisfies it.

The escalation from advisory → hard-fail requires an org-level policy file opt-in (require_explicit_includes: true), parsed in policy/parser.py:218. The two checks are wired to separate runners (run_baseline_checks vs run_dependency_policy_checks), so there is no risk of the advisory accidentally becoming a hard block at the baseline level.


Finding 6 · Hash comparison timing safety — not applicable ✅

The dict(self.local_deployed_file_hashes) != dict(other.local_deployed_file_hashes) at lockfile.py:458 uses Python's standard !=. This is used in is_semantically_equivalent(), which is a write-optimization (avoid rewriting lockfile when nothing changed), not a security gate. The attacker already knows the expected hashes (they're in the lockfile on disk), so there is no secret to leak via timing side-channels. hmac.compare_digest is not needed here.

The actual integrity check in _check_content_integrity (ci_checks.py:297) uses compute_file_hash(file_path) != expected_hash — also not timing-sensitive because the comparison detects drift between on-disk content and a locally-recorded hash, not authentication against a remote secret.


Finding 7 · Bundle lockfile strip has no race condition ✅

lockfile_enrichment.py:163-165 operates on a fresh yaml.safe_load() copy of the serialized lockfile, not the in-memory LockFile object. The to_yaml() call at line 146 already excludes the self-entry via the pop/try/finally pattern. The subsequent data.pop() removes the flat local_deployed_files / local_deployed_file_hashes keys from the deserialized dict copy. No shared mutable state is involved — no race condition is possible.


Summary

# Finding Severity Action
1 includes paths not validated against traversal Medium Add validate_path_segments at parse time
2 _SELF_KEY private import across modules Low Promote to public constant or expose predicate
3 get_all_dependencies() call-site audit ✅ Clean
4 Self-entry repo_url="<self>" fetch risk ✅ Clean Fail-closed by path validation
5 Advisory/hard-fail escalation separation ✅ Clean
6 Hash comparison timing safety ✅ Clean Not a secret comparison
7 Bundle lockfile strip race condition ✅ Clean No shared mutable state

Finding 1 is the only item I'd request addressed in this PR — it's a cheap parse-time guard that prevents a future vulnerability when includes gains file-operation semantics. Finding 2 is a hardening suggestion that can be addressed separately.

Generated by PR Review Panel for issue #889 · ● 15.1M ·

@danielmeppiel danielmeppiel added panel-review Trigger the apm-review-panel gh-aw workflow and removed panel-review Trigger the apm-review-panel gh-aw workflow labels Apr 24, 2026
@github-actions
Copy link
Copy Markdown

APM Review Panel Verdict

Disposition: APPROVE (with two minor follow-up items noted below; neither is a merge blocker)


Per-persona findings

Python Architect:

This PR closes a real audit gap with a well-reasoned design. The self-entry synthesis, the includes: manifest field, and the policy/audit wiring are all correctly layered. The class diagram and execution flow are below.

classDiagram
    direction TB
    class LockFile {
        <<DataStore>>
        +dependencies: Dict[str, LockedDependency]
        +local_deployed_files: List[str]
        +local_deployed_file_hashes: Dict[str, str]
        +get_all_dependencies() List
        +get_package_dependencies() List
        +from_yaml(yaml_str) LockFile
        +to_yaml() str
    }
    note for LockFile "Self-Entry Synthesis:\n'.' key synthesized in from_yaml()\nstripped (pop + finally) in to_yaml()"
    class LockedDependency {
        <<ValueObject>>
        +repo_url: str
        +source: str
        +local_path: str
        +is_dev: bool
        +depth: int
        +deployed_file_hashes: Dict
    }
    class APMPackage {
        <<DataClass>>
        +includes: Optional[Union[str, List[str]]]
        +from_apm_yml(path) APMPackage
    }
    class ManifestPolicy {
        <<DataClass>>
        +require_explicit_includes: bool
    }
    class ApmPolicy {
        <<DataClass>>
        +manifest: ManifestPolicy
    }
    class CIChecks {
        <<Module>>
        +run_baseline_checks(root) CIAuditResult
        +_check_content_integrity(root, lock) CheckResult
        +_check_includes_consent(manifest, lock) CheckResult
    }
    class PolicyChecks {
        <<Module>>
        +run_dependency_policy_checks(...) CIAuditResult
        +_check_includes_explicit(includes, policy) CheckResult
        +_INCLUDES_NOT_PROVIDED: object
    }
    note for PolicyChecks "Sentinel Object pattern:\n_INCLUDES_NOT_PROVIDED skips\nexplicit-includes check for\nlegacy callers without manifest ctx"
    class BaseIntegrator {
        <<AbstractBase>>
        +validate_deploy_path(rel, root) bool
    }
    LockFile *-- LockedDependency : contains
    CIChecks ..> LockFile : reads
    CIChecks ..> APMPackage : reads includes
    CIChecks ..> BaseIntegrator : validate_deploy_path guard
    PolicyChecks ..> APMPackage : reads includes
    PolicyChecks ..> ManifestPolicy : reads require_explicit_includes
    ApmPolicy *-- ManifestPolicy
    class LockFile:::touched
    class APMPackage:::touched
    class ManifestPolicy:::touched
    class CIChecks:::touched
    class PolicyChecks:::touched
    classDef touched fill:#fff3b0,stroke:#d47600
Loading
flowchart TD
    A["apm audit --ci\ncommands/audit.py"] --> B["run_baseline_checks\npolicy/ci_checks.py"]
    B --> C["_check_lockfile_exists\n[FS] reads apm.yml, conditionally reads apm.lock.yaml\nNEW: promotes local-only repos to has_deps=True"]
    C --> D{"has_deps or\nlocal_deployed_files?"}
    D -- No --> E["[+] no deps, lockfile not required"]
    D -- Yes --> F{"lockfile on disk?"}
    F -- No --> G["[x] lockfile missing"]
    F -- Yes --> H["[FS] LockFile.from_yaml()\nsynthesizes '.' self-entry from local_deployed_files"]
    H --> I["_check_ref_consistency"]
    I --> J["_check_deployed_files_present\n[FS] checks deployed file paths"]
    J --> K["_check_no_orphans\nskips dep_key == _SELF_KEY"]
    K --> L["_check_config_consistency"]
    L --> M["_check_content_integrity\n[FS] scan_lockfile_packages + compute_file_hash\nNEW: iterates lock.dependencies incl. self-entry\napplies validate_deploy_path guard per rel_path"]
    M --> N{"unicode issues\nor hash drift?"}
    N -- No --> O["[+] No Unicode or hash drift"]
    N -- Yes --> P["[x] N file(s) with hash drift -- run 'apm install'\ndep_label = '<self>' for local content"]
    O --> Q["_check_includes_consent (advisory, passed=True always)\nNEW: nudges toward 'includes:' declaration"]
    P --> Q
    Q --> R["CIAuditResult (7 checks)"]
    S["apm install\ninstall/phases/policy_gate.py"] --> T["extra_kwargs = {manifest_includes: apm_package.includes}\nif ctx.apm_package else {}"]
    T --> U["run_dependency_policy_checks(...manifest_includes=...)"]
    U --> V["_check_includes_explicit\nNEW: skipped if _INCLUDES_NOT_PROVIDED"]
    V --> W{"require_explicit_includes?"}
    W -- False --> X["[+] skip"]
    W -- True --> Y{"includes == None or 'auto'?"}
    Y -- Yes --> Z["[x] block install"]
    Y -- "No (list)" --> AA["[+] explicit paths OK"]
Loading

Design patterns

  • Used in this PR: Sentinel Object (_INCLUDES_NOT_PROVIDED = object()) -- allows run_dependency_policy_checks to distinguish "caller did not supply includes" from None (explicitly absent). Shown in the class diagram PolicyChecks note. Value Object + Lazy Synthesis -- LockedDependency for the "." self-entry materialized at deserialization and stripped at serialization, annotated in the LockFile note.
  • Pragmatic suggestion: The to_yaml() pop-and-restore pattern (self.dependencies.pop(_SELF_KEY, None) + finally) mutates the instance dict, which is a thread-safety smell for a value used as a data store. A cleaner approach: call get_package_dependencies() (which already excludes the self-entry) directly in to_yaml() instead of popping from the live dict. That eliminates the mutation with no behavior change. No new abstraction needed -- pure simplification.

One inconsistency: get_package_dependencies() filters with dep.local_path != "." (literal string), while the _get_all_install_paths loop uses dep.local_path == _SELF_KEY. Both evaluate to "." since _SELF_KEY = ".", but using two different comparison axes for the same sentinel is a minor readability hazard. Recommend picking one style.


CLI Logging Expert:

Output paths are correct: all findings flow through CheckResult and surface via the existing CI table renderer. No direct _rich_* calls introduced in command functions.

Hash drift detail lines ("hash-drift: .github/prompts/file.md (dep=<self>, expected=abc123..., actual=def456...)") are verbose but appropriate for CI debug output. Truncating to 12 chars is good for terminal width; full hashes available via --json.

One concern: _check_includes_consent always returns passed=True, causing the CI table to render [+] for an advisory message that begins "consider adding...". Users conditioned to green=good may miss the nudge. The docstring explains this is intentional and the policy field promotes it to a hard block. Acceptable design, but documenting this in the CLI reference or a FAQ entry would reduce confusion.

The remedy string in _check_content_integrity ("run 'apm install' to restore drifted files") is correct, actionable, and follows the "Include the fix" rule.


DevX UX Expert:

The includes: auto scaffold in apm init (commands/_helpers.py) is the right default: new users get auditability without any extra configuration, matching the "5-minute to success" north star.

The three-tier mental model (undeclared / auto / explicit list) is clean and well-documented in manifest-schema.md. The progression from advisory to hard block (via policy) mirrors how other package manager enforcement works (eslint warn -> error, cargo features gated, etc.) -- familiar to the target audience.

CLI reference (docs/src/content/docs/reference/cli-commands.md), manifest schema, lockfile spec, and governance docs are all updated in the same PR. Rule 4 compliance confirmed.

The includes-consent advisory check showing [+] while containing a "consider adding" message is a mild UX gap (echoed from CLI Logging). Minor enough to defer -- the policy field provides the hard path for teams that need it.


Supply Chain Security Expert:

Integrity improvements are genuine: hash drift detection for locally-authored files closes a real gap. The validate_deploy_path guard is correctly applied inside the hash verification loop, mirroring the existing guard in _check_deployed_files_present.

Three security observations:

  1. Circular trust awareness: The hash check compares on-disk content against hashes stored in the same apm.lock.yaml. If an attacker has write access to both the file and the lockfile, they can update both simultaneously and bypass detection. This is the known limitation of all lockfile-based integrity checks and is not introduced by this PR -- the existing SHA pinning for remote deps has the same property. Acceptable, but worth documenting in enterprise/security.md as a scope boundary.

  2. _SELF_KEY is a private constant imported across module boundaries (from ..deps.lockfile import _SELF_KEY in ci_checks.py and install/phases/cleanup.py). Python convention is to not import _-prefixed names outside their defining module. If _SELF_KEY becomes a public constant (SELF_KEY) or callers use dep.local_path == "." instead, the coupling is cleaner. Currently the constant is also compared against dep.local_path (a different field that happens to share the value "."), which is confusing.

  3. Silent swallow in _check_lockfile_exists: The new try/except Exception: pass block that reads the lockfile to detect local_deployed_files silently drops YAML parse errors. A corrupted lockfile would silently fall through as "no local content," skipping the downstream audit checks. At minimum, a logger.debug() on the exception path would make this debuggable.

No path traversal risks introduced. Packer stripping of local_deployed_files from bundled lockfiles is correct and prevents phantom self-entries on consumer unpack.


Auth Expert: Not activated -- no auth-related files changed; the PR touches lockfile.py, ci_checks.py, policy_checks.py, models/apm_package.py, and integration modules, none of which affect token resolution, AuthResolver host classification, or credential helpers.


OSS Growth Hacker:

This PR is a strong enterprise narrative beat: "APM now detects config drift in your team's locally-authored AI primitives." That maps directly to the trust / governance story that CISOs and VPEs care about (per Lorenzo Storelli's AI Controls prototype using APM as substrate).

Specific growth angles:

  1. The includes: auto default in apm init means new users get the auditability story for free without reading docs -- zero-friction adoption of a governance feature.
  2. The includes-consent advisory in CI output is a discovery nudge that appears in the natural workflow. Users learn about includes: through apm audit --ci without needing a separate docs expedition.
  3. require_explicit_includes: true is a one-line policy field that compliance teams can gate -- worth a dedicated mention in the next release post alongside the enterprise governance guide.

Side-channel to CEO: this is the first feature that makes APM's local content governance machine-verifiable in CI. Worth framing the release note around "trust but verify your team's AI configurations" -- that's a viral angle for security-conscious engineering orgs. The WIP/growth-strategy.md entry should note this as a conversion surface for the enterprise segment.


CEO arbitration

Specialists aligned: this is a well-scoped, well-tested feature that closes a genuine audit gap without introducing new threat surfaces. The thread-safety smell in to_yaml() and the cross-module private-constant import are both worth addressing but neither is a merge blocker at the current usage patterns. The [+] advisory rendering is a conscious design decision documented in the code, appropriate given the policy promotion path. The CHANGELOG is thorough, no breaking changes without migration lines, and all Rule 4 doc sync requirements are met.

Two items are worth a follow-up issue rather than blocking merge: the to_yaml() mutation (replace pop+finally with a get_package_dependencies()-based serialization path) and the _SELF_KEY export hygiene (make it public or use dep.local_path == "." uniformly). Strategic call: APPROVE. The enterprise governance narrative this unlocks is directly aligned with APM's positioning as the trust layer for AI-native development.


Required actions before merge

  1. None -- disposition is APPROVE with no blocking pre-merge requirements. The two items below are filed as follow-ups.

Optional follow-ups

  • to_yaml() mutation cleanup (src/apm_cli/deps/lockfile.py:266-289): Replace the pop+finally pattern with a direct call to get_package_dependencies() (or an equivalent filter in the serialization loop) to eliminate mutable dict state and the thread-safety smell.
  • _SELF_KEY hygiene (src/apm_cli/deps/lockfile.py:18): Rename to SELF_KEY (public) or replace cross-module imports in ci_checks.py and cleanup.py with dep.local_path == "." checks, matching get_package_dependencies()'s style.
  • Silent swallow in _check_lockfile_exists (src/apm_cli/policy/ci_checks.py:56-66): Add logger.debug("Could not read lockfile to check local_deployed_files: %s", e) in the except block so YAML corruption is debuggable in verbose mode.
  • Document circular-trust limitation (docs/src/content/docs/enterprise/security.md): Add a note that hash drift detection catches post-install edits but does not protect against a simultaneous lockfile + file modification by an attacker with write access to both.
  • Advisory UX (src/apm_cli/policy/ci_checks.py:339-374): If user research shows [+] advisory messages are missed, consider a dedicated advisory result type that renders as [i] in the CI table. Requires updating the table renderer, so defer until there is evidence of missed advisories.

Generated by PR Review Panel for issue #889 · ● 1.2M ·

@danielmeppiel danielmeppiel merged commit 7d62017 into main Apr 24, 2026
16 checks passed
@danielmeppiel danielmeppiel deleted the feat/audit-includes-887 branch April 24, 2026 17:40
srid added a commit to juspay/kolu that referenced this pull request Apr 25, 2026
Rewrite `just ai::apm-sync` around two upstream apm improvements:

* microsoft/apm#889 (merged) wires content-integrity hash verification
  into `apm audit --ci` — every APM-managed file (`.claude/`, `.codex/`,
  `.agents/`, `.opencode/`, `opencode.json`, MCP configs) is now
  checked against the lockfile's recorded hashes.  The recursive
  `diff -r` loop that re-installed into scratch and compared each
  deploy dir is redundant and goes away.

* microsoft/apm#888 (`apm install --root`) replaces the rsync-the-
  worktree-into-scratch dance with a single flag.  The scratch
  staging that remains is purely for the `apm compile` leg — apm's
  distributed compiler scans the project tree to score AGENTS.md
  placement, so the scratch needs the source file inventory.  It
  drops to an `rsync -aH` with one set of excludes (down from ~20
  lines of setup + 25 lines of diff gates).

* Staged on juspay/apm@feat/install-compile-root-flag until #888 lands
  in microsoft/apm — revert the `apm_cmd` pin once that happens.

Net: the recipe trims from 127 lines to 51; all seven audit checks
pass, and the per-AGENTS.md diff loop still catches compile-output
drift that audit can't (AGENTS.md files aren't lockfile-tracked).
srid added a commit to juspay/kolu that referenced this pull request Apr 25, 2026
Rewrite `just ai::apm-sync` around two upstream apm improvements:

* microsoft/apm#889 (merged) wires content-integrity hash verification
  into `apm audit --ci` — every APM-managed file (`.claude/`, `.codex/`,
  `.agents/`, `.opencode/`, `opencode.json`, MCP configs) is now
  checked against the lockfile's recorded hashes.  The recursive
  `diff -r` loop that re-installed into scratch and compared each
  deploy dir is redundant and goes away.

* microsoft/apm#888 (`apm install --root`) replaces the rsync-the-
  worktree-into-scratch dance with a single flag.  The scratch
  staging that remains is purely for the `apm compile` leg — apm's
  distributed compiler scans the project tree to score AGENTS.md
  placement, so the scratch needs the source file inventory.  It
  drops to an `rsync -aH` with one set of excludes (down from ~20
  lines of setup + 25 lines of diff gates).

* Staged on juspay/apm@feat/install-compile-root-flag until #888 lands
  in microsoft/apm — revert the `apm_cmd` pin once that happens.

Net: the recipe trims from 127 lines to 51; all seven audit checks
pass, and the per-AGENTS.md diff loop still catches compile-output
drift that audit can't (AGENTS.md files aren't lockfile-tracked).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

panel-review Trigger the apm-review-panel gh-aw workflow

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: close audit-blindness gap for local .apm/ content via virtual self-entry + includes: manifest field

3 participants