Skip to content

feat(marketplace): add audit command to flag deps that bypass marketplace pinning#881

Open
edenfunf wants to merge 3 commits intomicrosoft:mainfrom
edenfunf:feat/marketplace-doctor-847
Open

feat(marketplace): add audit command to flag deps that bypass marketplace pinning#881
edenfunf wants to merge 3 commits intomicrosoft:mainfrom
edenfunf:feat/marketplace-doctor-847

Conversation

@edenfunf
Copy link
Copy Markdown
Contributor

@edenfunf edenfunf commented Apr 23, 2026

Description

A marketplace author pinning a package at a specific SHA still leaves a supply-chain gap: if that package's own apm.yml declares dependencies.apm using direct repo paths (e.g. owner/repo/path), those transitive deps are resolved via git clone and track HEAD -- the marketplace's version pinning does not flow through them.

This PR adds apm marketplace audit <name> that fetches each plugin's own apm.yml at its pinned ref and warns when a dependencies.apm entry would be resolved outside the marketplace catalogue.

Classification

  • Bypasses (warn): bare owner/repo, owner/repo/subpath, https:// / ssh:// git URLs, and { git: URL, ... } object-form entries.
  • Clean: name@marketplace refs; local paths (./x, /abs, ../x).
  • Skipped: plugin has no apm.yml at the pinned ref, or source type is not an addressable github manifest.
  • Unverifiable errors: fetch failure or malformed YAML (per-plugin isolation -- one bad plugin does not abort the run).

Behavior

  • Default run is informational and exits 0. The per-plugin section header is suppressed on an all-clean run so the summary is not preceded by an empty section.
  • --strict exits non-zero on any bypass warning or unverifiable plugin, for use in CI.
  • --verbose surfaces clean plugins and skipped reasons inline; on errors it also prints the captured traceback for debugging.

Type of change

  • New feature

Implementation notes

  • Lives at src/apm_cli/commands/marketplace/audit.py (new package layout from refactor: split marketplace commands into package modules #1024) for the CLI surface and src/apm_cli/marketplace/audit.py for domain logic. Wired into MarketplaceGroup as an Authoring command between outdated and doctor.
  • Coexists with the existing apm marketplace doctor (environment diagnostics) -- different mental model, different subcommand. Aligns with the existing top-level apm audit (content integrity scan) under the project's "audit = security/integrity" naming family.
  • fetch_raw() is added as a new public primitive in marketplace/client.py alongside the new _try_proxy_fetch_raw() helper. _try_proxy_fetch and _fetch_file are reconciled with main: their public contracts are preserved (returns dict | None, raises MarketplaceFetchError); internally, _try_proxy_fetch now delegates to _try_proxy_fetch_raw to keep proxy I/O DRY between marketplace.json and arbitrary-file callers. No existing call site changes.
  • fetch_raw() raises the neutral MarketplaceError base so audit can surface its own per-plugin context instead of inheriting the MarketplaceFetchError "run apm marketplace update" retry hint, which is wrong at plugin granularity.
  • Object-style dep entries ({ git: ..., path: ..., ref: ... } -- the same shape DependencyReference.parse_from_dict accepts) are normalized to strings before classification so they are not silently dropped.
  • Path traversal in the plugin source path field is rejected via the existing validate_path_segments.

Review feedback addressed

Per @danielmeppiel's review:

  • Naming: renamed doctor -> audit (option 1). The existing apm marketplace doctor (environment diagnostics) is unchanged.
  • Package layout: re-ported into the new src/apm_cli/commands/marketplace/ package as audit.py, registered in __init__.py alongside check, outdated, etc., and added to MarketplaceGroup._authoring_commands.
  • marketplace/client.py reconciled against main: kept _try_proxy_fetch semantically identical (signature, return type, JSON-decode-on-bytes behavior), added _try_proxy_fetch_raw as a sibling helper that _try_proxy_fetch now delegates to. _fetch_file is unchanged from main. The existing test_marketplace_client.py patches at _try_proxy_fetch continue to work without modification.
  • Doc sync:
    • CHANGELOG.md -- new ### Added entry under [Unreleased].
    • docs/src/content/docs/reference/cli-commands.md -- full command reference (synopsis, flags, classification, exit codes, examples).
    • packages/apm-guide/.apm/skills/apm-usage/commands.md -- in-product guide entry alongside other marketplace authoring commands.

Testing

  • Tested locally
  • All existing tests pass (uv run pytest tests/unit tests/test_console.py -- 6982 passed; the preexisting environment-specific test_is_tool_available flake is unrelated).
  • 53 new tests in tests/unit/marketplace/test_marketplace_audit.py covering the classifier, dep normalisation including dict-form, plugin coord resolution, fetch error-message regression, and the CLI command (default / --strict on bypass / --strict on unverifiable plugin / unverifiable without strict / clean exits / --verbose output).
  • 3 new tests in tests/unit/marketplace/test_marketplace_client.py::TestFetchRaw pinning the public contract of fetch_raw: raises neutral MarketplaceError (not MarketplaceFetchError), short-circuits on proxy hit, and respects PROXY_REGISTRY_ONLY=1 to keep direct GitHub fetches off.

Manual verification

End-to-end run of apm marketplace audit exercising the full CLI through real Click + real run_audit / classify_dependency / fetch_plugin_apm_yml orchestration with only network boundaries stubbed, across:

  • string-form bypass, dict-form { git: https:// ... }, dict-form { git: git@... } SSH -- all flagged
  • marketplace ref only, local path only, mixed marketplace + local -- all clean
  • missing apm.yml (skipped), malformed YAML (parse error), HTTP 5xx (network error)
  • singular/plural subject-verb agreement, --strict exit code, --verbose clean+skipped detail, summary counts, conditional section header on all-clean

Additionally, apm marketplace add (which goes through fetch_marketplace -> _fetch_file -> _try_proxy_fetch -> _try_proxy_fetch_raw) was confirmed against the real GitHub API to ensure the client.py reconciliation did not break the live network path used by all existing marketplace consumer commands.

Fixes #847

Copy link
Copy Markdown
Collaborator

@danielmeppiel danielmeppiel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strategic context

Thanks for this work, @edenfunf. A dependency-bypass auditor is exactly the kind of marquee surface APM's Secure-by-default story needs, and the supply-chain rigor in this PR (see "What's already great" below) is well above the bar we usually see for a first contribution. We very much want to land this -- but the branch has drifted hard against main since you opened it, and a clean rebase isn't mechanical anymore. Below are the three blockers, with concrete options for each.


Blockers (ordered)

[x] 1. Naming collision: apm marketplace doctor already exists on main

Since refactor PR #1024, main ships an apm marketplace doctor command at src/apm_cli/commands/marketplace/doctor.py that runs environment diagnostics (git binary, network reachability, auth, marketplace config readiness). Your doctor is a dependency-bypass auditor -- different inputs, different outputs, different mental model. They cannot coexist under the same name.

Resolution options (pick one -- see "Naming proposals" below for our recommendation).

[x] 2. Target file src/apm_cli/commands/marketplace.py was deleted on main

PR #1024 split that single file into a marketplace/ package: __init__.py, check.py, doctor.py, init.py, migrate.py, outdated.py, publish.py, validate.py, plugin/. Your PR adds 112 lines to the now-deleted marketplace.py, so a rebase produces CONFLICT (modify/delete).

Resolution: re-port your command registration into the new package layout. Concretely, after picking a name (say audit), add src/apm_cli/commands/marketplace/audit.py and wire it from src/apm_cli/commands/marketplace/__init__.py next to the existing check, doctor, outdated, etc. registrations.

[!] 3. src/apm_cli/marketplace/client.py has evolved independently

Your PR refactors _try_proxy_fetch -> _try_proxy_fetch_raw and adds fetch_raw. Main has independently changed the same _try_proxy_fetch (modern dict | None return type, cfg.scheme / headers wiring). The conflict is resolvable but non-trivial -- please reconcile against current main rather than force-applying your version, so the proxy fetch path keeps the upstream improvements.


Naming proposals (you pick)

In rough order of our preference:

  1. apm marketplace audit -- our recommendation. Cleanest semantic split from environment doctor, easy to document, no breaking change for existing users. Short, memorable, signals "supply-chain audit".
  2. Fold into apm marketplace check --strict-pinning -- attractive if you'd rather not introduce a new top-level subcommand. Trade-off: check is currently lockfile drift / integrity oriented, so the flag would need a clear scope.
  3. Rename existing env-diagnostic to apm marketplace env-doctor, keep doctor for your auditor -- works semantically, but it's a breaking change to a shipped command and would need a deprecation shim plus a CHANGELOG BREAKING entry.

Maintainers are happy with option 1. If you'd prefer 2 or 3, flag it in your next push and we'll align.


Doc requirements (repo doc-sync rule)

Any new CLI command must update the following in the same PR:

  • CHANGELOG.md -- new ### Added entry under [Unreleased], following the Keep a Changelog format already used in the file.
  • docs/src/content/docs/reference/cli.md (or the closest existing page under docs/src/content/docs/) -- one short section: synopsis, flags, exit codes, a one-line example.
  • packages/apm-guide/.apm/skills/apm-usage/commands.md -- add the new command alongside the other apm marketplace ... entries so the in-product guide stays in sync.

If option 3 above is chosen, also add a ### Changed (BREAKING) CHANGELOG entry for the doctor rename and update any docs that reference the existing marketplace doctor.


What's already great

This is a serious piece of work and we don't want any of it lost in the rebase:

  • classify_dependency -- clean separation between marketplace-pinned, bypassed, and ambiguous cases. Exactly the right primitive for this surface.
  • Dict-form bypass detection -- catching the {name: ..., source: ...} long-form that side-steps marketplace pinning is the subtle case most reviewers would have missed.
  • FetchStatus enum for fault isolation -- distinguishing "marketplace says no" from "network failed" from "auth missing" is the right call; it keeps the auditor honest under partial outages and makes exit-code semantics tractable.
  • Path-traversal guards -- aligned with the repo's validate_path_segments / ensure_path_within rule. Good instinct.
  • 592 lines of unit tests for doctor.py -- coverage is genuinely thorough.

None of this needs to change. The blockers are all about where the code lives and what it's called, not what it does.


Next step

  1. Rebase onto current main.
  2. Pick a name (we recommend audit).
  3. Re-port the command into the new src/apm_cli/commands/marketplace/ package layout.
  4. Reconcile marketplace/client.py against main's _try_proxy_fetch rather than overwriting it.
  5. Add the three doc updates listed above.
  6. Re-request review.

If the rebase is painful (it will be), say the word in a comment and a maintainer will pair on it or push a rebase branch you can pull from. We'd rather help you across the line than see this stall.

Thanks again for the contribution.

Copilot AI review requested due to automatic review settings April 30, 2026 18:11
@edenfunf edenfunf force-pushed the feat/marketplace-doctor-847 branch from 8f1c9d4 to ec91c42 Compare April 30, 2026 18:11
@edenfunf edenfunf changed the title feat(marketplace): add doctor command to flag deps that bypass marketplace pinning feat(marketplace): add audit command to flag deps that bypass marketplace pinning Apr 30, 2026
@edenfunf
Copy link
Copy Markdown
Contributor Author

edenfunf commented Apr 30, 2026

@danielmeppiel thanks for the thorough review. I have rebased onto current main and addressed all three blockers plus the doc-sync requirement. Force-pushed; the branch is now a single clean commit on top of main. Re-requesting review when you have a moment.

How each blocker was resolved

1. Naming collision (doctor) -- went with option 1, apm marketplace audit. The existing apm marketplace doctor (environment diagnostics) is unchanged, and audit matches the project's existing security-family naming alongside the top-level apm audit (content integrity scan). No breaking change.

2. Target file deletion -- re-ported the command into the new src/apm_cli/commands/marketplace/ package as audit.py. Registration is in __init__.py next to the existing check/outdated/doctor imports, and audit is added to MarketplaceGroup._authoring_commands between outdated and doctor. The 112 lines that previously sat in the deleted commands/marketplace.py are now in the new module, refreshed for the package layout.

3. client.py evolved independently -- I want to call out the design choice here so the diff reads correctly. Rather than overwriting _try_proxy_fetch, the function is kept exactly as main has it (same (source, file_path) -> dict | None signature, same JSON-decode-on-bytes behavior, same caller). The new _try_proxy_fetch_raw(owner, repo, file_path, ref) -> bytes | None is a sibling helper, and _try_proxy_fetch now delegates to it for the byte-level fetch -- so the proxy I/O path stays DRY between marketplace.json and arbitrary-file callers, but no existing call site has to change. _fetch_file is byte-identical to main, so the existing test_marketplace_client.py::TestPrivateRepoAuth patch at _try_proxy_fetch continues to work without modification. The new fetch_raw public primitive is purely additive.

Doc sync (per repo rule)

  • CHANGELOG.md -- new `### Added` entry under `[Unreleased]` (Keep a Changelog format).
  • docs/src/content/docs/reference/cli-commands.md -- full command reference (synopsis, args, flags, classification table, exit codes, examples) inserted before the existing apm marketplace doctor section.
  • packages/apm-guide/.apm/skills/apm-usage/commands.md -- in-product guide entry next to the other apm marketplace ... rows.

Tests

Total 56 new tests, all passing alongside the existing suite (6982 passed):

  • 53 in tests/unit/marketplace/test_marketplace_audit.py covering the classifier, dep normalisation including dict-form, plugin coord resolution, the MarketplaceFetchError-leakage regression, and the CLI command across default / --strict on bypass / --strict on unverifiable plugin / unverifiable without --strict / clean exit / --verbose clean+skipped detail.
  • 3 in tests/unit/marketplace/test_marketplace_client.py::TestFetchRaw pinning the contract of fetch_raw: it raises the neutral MarketplaceError (not MarketplaceFetchError), short-circuits on proxy hit, and honors PROXY_REGISTRY_ONLY=1 to keep direct GitHub fetches off.

Other small improvements made during the rebase

  • Modernised marketplace/audit.py type hints to the project's PEP 604 / PEP 585 style (X | None, list[X], tuple[X, ...], Callable from collections.abc) so it matches the rest of main.
  • Removed (issue #847) references from docstrings and comments -- issue tracking already lives in the commit message (Fixes #847) and this PR description, and source-level mentions tend to drift as the codebase evolves.
  • Added traceback.format_exc() to the verbose error path in the CLI command so -v retains diagnostic value when the audit fails before reaching the per-plugin loop, matching the convention in commands/marketplace/check.py.
  • Suppressed the Audit Results: per-plugin section header on an all-clean default run so the summary is not preceded by an empty section.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds a new apm marketplace audit command to detect transitive dependencies.apm entries that bypass marketplace pinning by fetching each plugin’s apm.yml at its pinned ref and classifying dependency shapes.

Changes:

  • Refactors marketplace client proxy fetching to expose a reusable fetch_raw() primitive for arbitrary files at a ref.
  • Implements marketplace audit core logic + Click CLI command (apm marketplace audit NAME, with --strict / -v).
  • Adds unit + CLI integration tests and updates user-facing docs + changelog.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
tests/unit/marketplace/test_marketplace_client.py Adds focused tests for the new fetch_raw() error contract + proxy-only behavior.
tests/unit/marketplace/test_marketplace_audit.py Adds extensive test coverage for dependency classification, dep normalization, plugin coord resolution, fetch error messaging, and CLI --strict behavior.
src/apm_cli/marketplace/client.py Factors byte-level proxy fetch and introduces fetch_raw() for raw file retrieval with shared auth/proxy behavior.
src/apm_cli/marketplace/audit.py Implements audit core: classification, dep collection/normalization, plugin GitHub coord resolution, per-plugin fetch/parse, and aggregation.
src/apm_cli/commands/marketplace/audit.py Adds the apm marketplace audit command with summary output and CI-friendly --strict exit behavior.
src/apm_cli/commands/marketplace/init.py Registers the new audit subcommand in the marketplace command group exports.
packages/apm-guide/.apm/skills/apm-usage/commands.md Documents apm marketplace audit NAME in the commands table.
docs/src/content/docs/reference/cli-commands.md Adds detailed CLI reference documentation for apm marketplace audit.
CHANGELOG.md Announces the new command and its behavior/exit semantics.

Comment thread CHANGELOG.md
Comment thread src/apm_cli/marketplace/client.py
Comment thread src/apm_cli/marketplace/client.py Outdated
Comment thread src/apm_cli/marketplace/client.py Outdated
Comment thread src/apm_cli/marketplace/client.py
Comment thread src/apm_cli/marketplace/client.py
Comment thread src/apm_cli/marketplace/audit.py
…lace pinning

A marketplace author pinning a package at a specific SHA still leaves
a supply-chain gap: if that package's own apm.yml declares
dependencies.apm using direct repo paths (e.g. owner/repo/path),
those transitive deps are resolved via git clone and track HEAD --
the marketplace's version pinning does not flow through them.

Add apm marketplace audit <name>:

- Fetches each plugin's own apm.yml at its pinned ref and warns when
  a dependencies.apm entry would be resolved outside the marketplace
  catalogue.

Classification:
- Bypasses (warn): bare owner/repo, owner/repo/subpath, https:// /
  ssh:// git URLs, and { git: URL, ... } object-form entries.
- Clean: name@marketplace refs; local paths (./x, /abs, ../x).
- Skipped: plugin has no apm.yml at the pinned ref, or source type
  is not an addressable github manifest.
- Unverifiable errors: fetch failure or malformed YAML (per-plugin
  isolation -- one bad plugin does not abort the run).

Behavior:
- Default run is informational and exits 0.  Suppresses the per-plugin
  section header on an all-clean run so the summary is not preceded by
  an empty section.
- --strict exits non-zero on any bypass warning or unverifiable
  plugin, for use in CI.
- --verbose surfaces clean plugins and skipped reasons inline; on
  errors, also prints the captured traceback for debugging.

Implementation notes:
- Lives at src/apm_cli/commands/marketplace/audit.py (new package
  layout from microsoft#1024) and src/apm_cli/marketplace/audit.py (domain
  logic). Wired into MarketplaceGroup as an Authoring command.
- Coexists with the existing apm marketplace doctor (environment
  diagnostics) -- different mental model, different subcommand.
- Aligns with the existing apm audit (content integrity scan) under
  the project's "audit = security/integrity" naming family.
- fetch_raw() is added as a new public primitive in marketplace/client.py
  alongside the new _try_proxy_fetch_raw() helper.  _try_proxy_fetch
  and _fetch_file are reconciled with main: their public contracts
  are preserved (returns dict | None, raises MarketplaceFetchError);
  internally, _try_proxy_fetch now delegates to _try_proxy_fetch_raw
  to keep proxy I/O DRY between marketplace.json and arbitrary-file
  callers.  No existing call site changes.
- fetch_raw() raises the neutral MarketplaceError base so audit can
  surface its own per-plugin context instead of inheriting the
  MarketplaceFetchError "run apm marketplace update" retry hint,
  which is wrong at plugin granularity.
- Object-style dep entries ({ git: ..., path: ..., ref: ... } -- the
  same shape DependencyReference.parse_from_dict accepts) are
  normalized to strings before classification so they are not
  silently dropped.
- Path traversal in the plugin source path field is rejected via the
  existing validate_path_segments.

Tests (56 new):
- 53 in tests/unit/marketplace/test_marketplace_audit.py covering the
  classifier, dep normalisation including dict-form, plugin coord
  resolution, fetch error-message regression, and the CLI command
  (default / --strict on bypass / --strict on unverifiable plugin /
  unverifiable without strict / clean exits / --verbose output).
- 3 in tests/unit/marketplace/test_marketplace_client.py::TestFetchRaw
  pinning the public contract of fetch_raw: raises neutral
  MarketplaceError (not MarketplaceFetchError), short-circuits on
  proxy hit, and respects PROXY_REGISTRY_ONLY=1 to keep direct GitHub
  fetches off.

Docs:
- CHANGELOG.md: Added entry under [Unreleased].
- docs/src/content/docs/reference/cli-commands.md: full command
  reference (synopsis, flags, classification, exit codes, examples,
  note on bypassing the 1h marketplace.json cache).
- packages/apm-guide/.apm/skills/apm-usage/commands.md: in-product
  guide entry alongside other marketplace authoring commands.

Fixes microsoft#847
@edenfunf edenfunf force-pushed the feat/marketplace-doctor-847 branch from 13f7e29 to e7b603d Compare May 1, 2026 06:07
@edenfunf
Copy link
Copy Markdown
Contributor Author

edenfunf commented May 1, 2026

Pushed e7b603d. Rebased onto current main, two of the review items became code changes, one extra security alignment fell out of the rebase, and two are left as-is for reasons below.

Addressed in code

(items 2-4) fetch_raw no longer returns None ambiguously. The PROXY_REGISTRY_ONLY block path used to share a None return with confirmed HTTP 404; audit mapped that to NO_MANIFEST (silent skip), so in a proxy-only deployment a marketplace could look clean while nothing was actually verified -- the exact failure mode supply-chain audits are supposed to prevent. fetch_raw now raises MarketplaceError on that path with the file coords and reason. The audit chain already wraps MarketplaceError to NETWORK_ERROR ("could not verify"), so --strict correctly exits 1 in proxy-only environments. Docstring updated; TestFetchRaw::test_raises_when_proxy_only_blocks_github pins the new contract.

(item 7) _resolve_plugin_github_coords now requires exact owner/name. Previously repo.partition("/") accepted owner/repo/extra and silently coerced it to owner='owner', name='repo/extra', which then 404'd against GitHub and surfaced as NO_MANIFEST. The check is now repo.count("/") != 1, so misconfigured catalogue entries get the more accurate UNSUPPORTED_SOURCE classification. Covered by test_repo_with_extra_slashes_unsupported.

Also caught during the rebase

fetch_raw was missing the host-kind guard that #1034 added to _fetch_file. Because fetch_raw is new on this branch, it never went through that review pass. Without the guard a plugin source with host: "gitlab.com" would have caused fetch_raw to attach Authorization: token <github_pat> to a request aimed at GitLab -- the exact credential-leak scenario the #1034 changelog calls out. fetch_raw now mirrors _fetch_file's structure: AuthResolver.classify_host(host) filters to github / ghe_cloud / ghes, and the inner _do_fetch keeps the conditional credential attach as defense-in-depth. test_non_github_host_rejected_before_request asserts no HTTP request is issued.

Not changed

(items 5-6) auth_resolver: object | None typing. A Protocol or the real AuthResolver class would be tighter, but _fetch_file on main uses the same object | None. Tightening only fetch_raw would create an inconsistency inside the same module; doing it across the codebase is a separate cross-cutting refactor that doesn't belong in a feature PR.

(item 1) The "doctor"/"audit" + #847/#881 metadata comment. The branch name feat/marketplace-doctor-847 is kept because renaming would close the PR; title, body, commit message, code, docs, and changelog all say audit. Fixes #847 references the issue; (#881) in the changelog references the PR delivering the fix -- these are intentionally different references and follow the Keep a Changelog convention already used in this file.

Status

  • uv run pytest tests/unit tests/test_console.py -- 7075 passed; the preexisting test_is_tool_available environment-only failure is unrelated.
  • 58 new tests: 54 in test_marketplace_audit.py, 4 in test_marketplace_client.py::TestFetchRaw (proxy hit, proxy-only raise, neutral-error wrap, non-GitHub fail-closed).
  • ruff clean across all changed files.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

marketplace doctor: warn when package dependencies bypass marketplace

3 participants