
apm unpack writes apm.lock.yaml / apm.yml to output dir, violating documented metadata-only contract #901

@danielmeppiel


Summary

apm unpack writes apm.lock.yaml and apm.yml into the output directory, even though the CLI's own documentation states these are bundle metadata, not output. This contract violation cascades up the stack:

Originating failure: https://github.com/microsoft/apm/actions/runs/24883083247/job/72856352586?pr=889

error: Your local changes to the following files would be overwritten by checkout:
        apm.lock.yaml
ERR_API: Failed to checkout PR branch: The process '/usr/bin/git' failed with exit code 1

Originating scenario (plain English)

An agentic workflow built on the documented gh-aw template shared/apm.md runs in two phases on a PR:

  1. Trusted base phase: checks out main, downloads the APM bundle (built earlier with isolated: 'true' into /tmp/gh-aw/apm-workspace), then calls microsoft/apm-action in restore mode to deploy primitives into the workspace. As a side-effect, the action writes apm.lock.yaml and apm.yml back into the workspace.
  2. PR checkout phase: gh-aw runs git checkout -B <branch> origin/pr-head. Git refuses because the workspace is dirty.

When the PR also modifies apm.lock.yaml (which is exactly what every dependency-related PR does), the writes collide and the checkout aborts. The agent never starts.
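To diagnose this failure, a workflow step could list the tracked files that the restore phase left modified before the PR checkout runs. The sketch below is a minimal illustration, not part of gh-aw or apm-action: the helper name dirty_tracked_paths is hypothetical, it expects the text output of git status --porcelain (v1 format), and it does not handle quoted paths.

```python
def dirty_tracked_paths(porcelain: str) -> list[str]:
    """Return tracked paths with local changes, given `git status --porcelain` output.

    Assumption: porcelain v1 lines look like "XY path", where XY is a
    two-character status code followed by a single space.
    """
    paths = []
    for line in porcelain.splitlines():
        if len(line) < 4:
            continue
        status, path = line[:2], line[3:]
        if status in ("??", "!!"):
            # Untracked and ignored entries are not tracked files;
            # this sketch only reports tracked ones.
            continue
        if "R" in status or "C" in status:
            # Renames and copies are printed as "old -> new"; keep the new path.
            path = path.split(" -> ")[-1]
        paths.append(path)
    return paths
```

In the originating scenario, the restore phase would be expected to leave apm.lock.yaml in this list, which is the file named in the checkout error above.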

Root cause: apm unpack contract violation

apm unpack's documented contract states that apm.lock.yaml is bundle metadata and is not copied to the output directory. The implementation does the opposite: it writes the full bundle contents — including apm.lock.yaml and apm.yml — into the output directory. Anything depending on apm unpack (most notably microsoft/apm-action) inherits the bug.

Proposed fix: align CLI with its own contract

Goal: apm unpack <bundle> --output <dir> writes only what the documented contract says it should: agent primitives under .github/{skills,agents,instructions,prompts}/ and apm_modules/. apm.lock.yaml and apm.yml are not written to <dir>.

Behavior change

| Item | Current | Proposed |
| --- | --- | --- |
| <dir>/.github/{skills,agents,instructions,prompts}/ | Written (overwrites collisions) | Written (no-clobber: existing files in <dir> win, mirroring discover_primitives_with_dependencies priority) |
| <dir>/apm_modules/ | Written | Written (wholesale; gitignored / package-owned) |
| <dir>/apm.lock.yaml | Written | Not written |
| <dir>/apm.yml | Written | Not written |
| Bundle metadata access | Implicit (via written files in <dir>) | Explicit: --metadata-dir <path> flag (optional) writes apm.lock.yaml + apm.yml to a separate caller-controlled path |
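The proposed behavior can be read as a routing rule applied to each path in the bundle. The sketch below shows one way to express that rule as a pure function. It is not the actual unpacker.py code: the function name classify_bundle_path and the action labels are hypothetical, and it assumes apm.lock.yaml and apm.yml sit at the bundle root.

```python
from pathlib import PurePosixPath

# Assumption: both metadata files live at the bundle root.
METADATA_FILES = {"apm.lock.yaml", "apm.yml"}
PRIMITIVE_DIRS = {"skills", "agents", "instructions", "prompts"}


def classify_bundle_path(rel_path: str) -> str:
    """Map a bundle-relative path to a (hypothetical) action label.

    "metadata"   -> never written to <dir>; goes to --metadata-dir if given
    "overwrite"  -> written wholesale (apm_modules/)
    "no-clobber" -> written only if <dir> has no file at that path
    "skip"       -> outside the documented output set
    """
    parts = PurePosixPath(rel_path).parts
    if len(parts) == 1 and parts[0] in METADATA_FILES:
        return "metadata"
    if len(parts) > 1 and parts[0] == "apm_modules":
        return "overwrite"
    if len(parts) > 2 and parts[0] == ".github" and parts[1] in PRIMITIVE_DIRS:
        return "no-clobber"
    return "skip"
```

Note that a file named apm.yml nested under apm_modules/ is treated as package-owned content here, not as bundle metadata; whether the real implementation should draw the line the same way is an open design choice.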

Why no-clobber for primitives

Today, unpacker.py:37-39 documents: "If a local file has the same name as a bundle file, the bundle file wins (overwrite)." This contradicts APM's own discovery priority (discover_primitives_with_dependencies: local primitives have highest priority). Aligning unpack's collision behavior with discovery's collision behavior makes the model consistent end-to-end.
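A minimal sketch of the no-clobber rule for a single file, assuming plain file copies on a local filesystem. The helper name copy_no_clobber is hypothetical and this is not the code in unpacker.py; it only illustrates "the local file wins".

```python
import shutil
from pathlib import Path


def copy_no_clobber(src: Path, dst: Path) -> bool:
    """Copy src to dst unless something already exists at dst.

    Returns True if the file was written, False if the existing
    local file was left untouched.
    """
    # is_symlink() also catches dangling symlinks, which exists() reports as missing.
    if dst.exists() or dst.is_symlink():
        return False
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dst)
    return True
```

The check-then-copy sequence is not atomic; that is probably acceptable for a single-process unpack into a CI workspace, but a stricter variant could open the destination in exclusive-create mode instead.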

Why no opt-in flag

Callers who relied on apm.lock.yaml being written to the output dir were relying on undocumented behavior that contradicts the CLI's own contract. Treat this as a Fixed entry in the CHANGELOG rather than gating the old behavior behind an opt-in flag.

Migration

Acceptance criteria

  • apm unpack <bundle> --output <dir> does not write apm.lock.yaml or apm.yml into <dir>.
  • apm unpack <bundle> --output <dir> with a pre-existing tracked primitive (e.g. <dir>/.github/agents/foo.agent.md) does not overwrite it; the local file wins.
  • apm unpack <bundle> --output <dir> --metadata-dir <metadir> writes apm.lock.yaml and apm.yml to <metadir> (and not to <dir>).
  • Documentation for apm unpack reflects the lockfile-is-metadata contract that the implementation now honors.
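The first two criteria could be checked by snapshotting the relevant paths in <dir> before and after running apm unpack and asserting that nothing changed. The helpers below are a hypothetical test-support sketch (the names snapshot and changed_paths are not from the APM codebase). They do not invoke the CLI and do not cover the --metadata-dir criterion.

```python
from pathlib import Path
from typing import Dict, Iterable, List, Optional


def snapshot(root: Path, rel_paths: Iterable[str]) -> Dict[str, Optional[bytes]]:
    """Record the bytes of each path under root, or None if it is not a file."""
    result: Dict[str, Optional[bytes]] = {}
    for rel in rel_paths:
        path = root / rel
        result[rel] = path.read_bytes() if path.is_file() else None
    return result


def changed_paths(
    before: Dict[str, Optional[bytes]], after: Dict[str, Optional[bytes]]
) -> List[str]:
    """Return paths that were created, removed, or modified between snapshots."""
    return [rel for rel in before if before[rel] != after.get(rel)]
```

A test would snapshot apm.lock.yaml, apm.yml, and a pre-existing primitive such as .github/agents/foo.agent.md, run the unpack, snapshot again, and expect changed_paths to return an empty list. Comparing snapshots rather than checking for absence matters because the workspace may already track its own apm.lock.yaml, as in the originating scenario.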

Three-layer plan (for cross-link visibility)

  1. microsoft/apm (this issue): align apm unpack with its documented contract. Root cause.
  2. microsoft/apm-action (apm-action#26, "v1.5: restore mode must install apm CLI so bundles are unpacked via 'apm unpack' (not raw tar fallback)"): ship v2 built on the fixed CLI. The working directory is no longer dirtied on restore. Every action consumer (gh-aw users, raw GitHub Actions users, and anyone else) gets the fix transparently.
  3. github/gh-aw (#28256, closed as redirect): bump shared/apm.md pin to microsoft/apm-action@v2 once available. One-line change. No template gymnastics.

Temporary mitigation (already shipping)

While the CLI + action fixes are in flight, microsoft/apm's own copy of shared/apm.md carries this single-line workaround after the Restore step:

- name: Reset tracked files after APM restore
  run: git checkout -- . 2>/dev/null || true

This is acceptable as a quarantined, single-repo fix because we control the copy. It is not the right shape for upstream gh-aw or for apm-action; those layers get the structural fix above. The workaround becomes a documented no-op once microsoft/apm-action#26 ships.

References

Out of scope

  • PR-controlled primitive shadowing in pr-review-panel-style workflows (separate higher-severity finding): once this fix lets the workflow succeed, gh-aw's checkout_pr_branch.cjs can replace trusted primitives (e.g. .github/skills/apm-review-panel/SKILL.md) with the PR's versions. Independent of this issue; will file separately against gh-aw with a proper mitigation proposal once the fix chain above lands.
