fix(policy): fail closed on malformed manifest YAML #1053
sergio-sisternes-epam merged 3 commits into main
Replace silent bypass with failing manifest-parse CheckResult when apm.yml cannot be parsed.
Fixes four code paths:
- run_policy_checks(): catch (ValueError, yaml.YAMLError) instead of silently returning empty CIAuditResult
- run_baseline_checks(): same pattern in ci_checks.py
- _check_lockfile_exists(): return passed=False instead of passed=True
- _load_raw_apm_yml(): replace bare except Exception with yaml.YAMLError and OSError, log WARNING on parse failure
Add manifest-parse to _CHECK_ARTIFACT_MAP for SARIF output.
14 new tests cover malformed YAML, non-dict YAML, scalar YAML, and regression guards. 522 policy tests pass, 6744 total unit tests pass.
Update security.md, policy-reference.md, governance-guide.md, governance.md skill resource, and CHANGELOG.
Closes #936
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
a7074a2 to 704e3d7
Pull request overview
Note: Copilot was unable to run its full agentic suite in this review.
Fixes a security gap where malformed apm.yml could cause policy/baseline checks to be silently bypassed, ensuring audits fail closed when the manifest cannot be parsed.
Changes:
- Fail-closed behavior added for malformed/non-mapping apm.yml across policy and baseline CI checks via a dedicated manifest-parse failing check.
- _load_raw_apm_yml() tightened to avoid swallowing parse/read errors silently and to log warnings for visibility.
- Docs and unit tests added/updated to describe and enforce the new failure semantics.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| tests/unit/policy/test_policy_checks.py | Adds regression/unit tests covering malformed manifest handling and _load_raw_apm_yml() behavior. |
| tests/unit/policy/test_ci_checks.py | Adds tests ensuring baseline checks fail-closed on malformed manifests (including a patched path). |
| src/apm_cli/policy/policy_checks.py | Implements fail-closed manifest parsing in run_policy_checks() and adds warning logs in _load_raw_apm_yml(). |
| src/apm_cli/policy/ci_checks.py | Implements fail-closed manifest parsing in _check_lockfile_exists() and run_baseline_checks(). |
| src/apm_cli/policy/models.py | Registers the new manifest-parse check artifact mapping to apm.yml. |
| packages/apm-guide/.apm/skills/apm-usage/governance.md | Documents that malformed apm.yml is unconditionally fail-closed via manifest-parse. |
| docs/src/content/docs/enterprise/security.md | Documents the new manifest integrity behavior for audits. |
| docs/src/content/docs/enterprise/policy-reference.md | Adds a manifest_parse row to the outcomes table (but see doc consistency comment below). |
| docs/src/content/docs/enterprise/governance-guide.md | Adds a failure semantics row for malformed project manifest (but see naming consistency comment below). |
| CHANGELOG.md | Records the security fix under [Unreleased]. |
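The models.py row above registers manifest-parse against apm.yml so the failure can be located in SARIF output. A minimal sketch of how such a mapping might be consumed follows. It is an illustration under assumptions, not the PR's code: `CHECK_ARTIFACT_MAP` stands in for `_CHECK_ARTIFACT_MAP`, `to_sarif_result` is a hypothetical helper, and only the manifest-parse to apm.yml pairing comes from the PR. The field names are those of a SARIF 2.1.0 result object.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical stand-in for the PR's _CHECK_ARTIFACT_MAP. Only the
# "manifest-parse" -> "apm.yml" pairing is stated in the PR.
CHECK_ARTIFACT_MAP: Dict[str, str] = {
    "manifest-parse": "apm.yml",
}


@dataclass
class CheckResult:
    name: str
    passed: bool
    message: str
    details: List[str] = field(default_factory=list)


def to_sarif_result(check: CheckResult) -> dict:
    """Render one failing check as a SARIF 2.1.0 result object (hypothetical helper)."""
    result = {
        "ruleId": check.name,
        "level": "error",
        "message": {"text": check.message},
    }
    uri = CHECK_ARTIFACT_MAP.get(check.name)
    if uri is not None:
        # Attach the file so code-scanning UIs can anchor the finding.
        result["locations"] = [
            {"physicalLocation": {"artifactLocation": {"uri": uri}}}
        ]
    return result
```

Without a map entry the result would carry no location, which is presumably why the new check name needed registering.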
- Split FileNotFoundError from OSError in _load_raw_apm_yml() to avoid noisy warnings for normal "missing file" conditions (TOCTOU)
- Add UnicodeDecodeError handling for non-UTF8/binary files
- Add OSError to TOCTOU catch in run_policy_checks() and _check_lockfile_exists() so file-disappear races map to manifest-parse failure instead of unhandled exceptions
- Fix docs: manifest_parse -> manifest-parse (hyphen) and clarify it is a local audit check, not a fetch outcome
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
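The exception split in this commit depends on handler order. FileNotFoundError is a subclass of OSError, so it has to be caught first, and UnicodeDecodeError is a subclass of ValueError, so an OSError handler never sees it. The sketch below illustrates that ordering under assumptions: `load_raw_manifest` is a hypothetical name, not the PR's `_load_raw_apm_yml()`, and the log messages are invented for the example.

```python
import logging
from pathlib import Path
from typing import Optional

import yaml  # PyYAML, assumed to be installed

logger = logging.getLogger(__name__)


def load_raw_manifest(project_root: Path) -> Optional[dict]:
    """Return the parsed apm.yml mapping, or None if it cannot be used."""
    path = Path(project_root) / "apm.yml"
    try:
        text = path.read_text(encoding="utf-8")
    except FileNotFoundError:
        # Must precede OSError (its parent class): a missing file is a
        # normal condition and should not produce a warning.
        return None
    except UnicodeDecodeError as exc:
        # Subclass of ValueError, not OSError: binary or non-UTF-8 content.
        logger.warning("apm.yml is not valid UTF-8: %s", exc)
        return None
    except OSError as exc:
        logger.warning("apm.yml could not be read: %s", exc)
        return None
    try:
        data = yaml.safe_load(text)
    except yaml.YAMLError as exc:
        logger.warning("apm.yml could not be parsed: %s", exc)
        return None
    # A scalar or list parses without error but is not a usable manifest.
    return data if isinstance(data, dict) else None
```

Returning None here is only acceptable because, as the panel's class diagram notes, the security-critical parse happens elsewhere and raises on failure.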
477e8e9 to a40e911
APM Review Panel Verdict: REJECT
Required before merge (10 items)
Nits (13 items, skip if you want)
CEO arbitration
The panel has converged on five substantive findings, three of which form a tightly coupled correctness-and-security cluster. First, the check-name contract violation in
Dissent resolved: The cli-logging-expert's required finding on
Growth/positioning note: This fix is being undersold. "APM silently skipped your security policy checks when
Per-persona findings (full)
Python Architect
classDiagram
namespace models {
class CheckResult {
<<dataclass>>
+name: str
+passed: bool
+message: str
+details: List[str]
}
class CIAuditResult {
<<dataclass>>
+checks: List[CheckResult]
+passed: bool
+failed_checks: List[CheckResult]
+to_json() dict
+to_sarif() dict
}
}
namespace policy_checks {
class PolicyCheckModule {
<<module>>
+_load_raw_apm_yml(project_root) Optional[dict]
+run_policy_checks(project_root, policy) CIAuditResult
+run_baseline_checks(project_root) CIAuditResult
}
}
namespace ci_checks {
class CICheckModule {
<<module>>
+_check_lockfile_exists(project_root) CheckResult
+run_baseline_checks(project_root) CIAuditResult
}
}
class APMPackage {
<<model>>
+from_apm_yml(path) APMPackage$
+has_apm_dependencies() bool
+get_apm_dependencies() List[DependencyReference]
}
class LockFile {
<<model>>
+read(path) Optional[LockFile]$
+dependencies: Dict[str, LockedDependency]
}
class ApmPolicy {
<<model>>
+dependency_policy: DependencyPolicy
}
CIAuditResult "1" o-- "*" CheckResult : contains
PolicyCheckModule ..> CheckResult : creates
PolicyCheckModule ..> CIAuditResult : returns
PolicyCheckModule ..> APMPackage : parses via
PolicyCheckModule ..> LockFile : reads
PolicyCheckModule ..> ApmPolicy : enforces
CICheckModule ..> CheckResult : creates
CICheckModule ..> CIAuditResult : returns
CICheckModule ..> APMPackage : parses via
CICheckModule ..> LockFile : reads
note for CICheckModule "_check_lockfile_exists can emit name=manifest-parse\ncontract mismatch (Required finding #1)"
note for PolicyCheckModule "Two parse paths:\n1. APMPackage.from_apm_yml (raises, fail-closed)\n2. _load_raw_apm_yml (returns None, silent)\nOnly path 1 is security-critical."
class CheckResult:::touched
class CIAuditResult:::touched
class PolicyCheckModule:::touched
class CICheckModule:::touched
classDef touched fill:#fff3b0,stroke:#d47600
flowchart TD
A([apm audit --ci]) --> B["ci_checks.run_baseline_checks\nci_checks.py:425"]
B --> C["_check_lockfile_exists\nci_checks.py:25 [FS]"]
C --> D{"apm.yml exists?\n[FS]"}
D -- no --> E["CheckResult(lockfile-exists, passed=True)"]
D -- yes --> F["APMPackage.from_apm_yml\n[FS] -- PARSE 1"]
F -- "ValueError/YAMLError" --> G["CheckResult(name=manifest-parse, passed=False)\nWRONG NAME for this function"]
F -- ok --> H{"has_deps?"}
H -- no deps/no lockfile --> I["CheckResult(lockfile-exists, passed=True)"]
H -- deps declared --> J{"lockfile_path.exists?\n[FS]"}
J -- yes --> K["CheckResult(lockfile-exists, passed=True)"]
J -- no --> L["CheckResult(lockfile-exists, passed=False)"]
E & G & I & K & L --> M["result.checks.append\nrun_baseline_checks:437"]
M --> N{"result.checks[0].passed?"}
N -- false --> O([return result fail-closed])
N -- true --> P["clear_apm_yml_cache()\nci_checks.py:455 CACHE BUST"]
P --> Q["APMPackage.from_apm_yml\n[FS] -- PARSE 2 REDUNDANT/TOCTOU"]
Q -- "ValueError/YAMLError" --> R["CheckResult(manifest-parse, passed=False)\nresult.checks.append, return"]
Q -- ok --> S["LockFile.read\n[FS]"]
S --> T["_check_ref_consistency\n_check_deployed_files_present\n_check_no_orphans\n_check_content_integrity\n[FS/LOCK]"]
T --> U([return CIAuditResult])
B2([policy_checks.run_policy_checks]) --> V{"apm.yml exists?\n[FS]"}
V -- no --> W([return empty CIAuditResult PERMISSIVE])
V -- yes --> X["APMPackage.from_apm_yml\n[FS] -- raises on malformed"]
X -- "ValueError/YAMLError" --> Y["CheckResult(manifest-parse, passed=False), return"]
X -- ok --> Z["_load_raw_apm_yml\n[FS] -- PARSE 2 returns None on error"]
Z --> AA["raw-field checks: compilation-target\ncompilation-strategy, scripts-policy\n(raw_yml or {}).get(...) -- PERMISSIVE on None"]
AA --> AB([return CIAuditResult])
Design patterns
CLI Logging Expert
Required findings above. No additional findings.
DevX UX Expert
Required findings above. No additional findings.
Supply Chain Security Expert
Required findings above. Additional nit:
Auth Expert
Inactive -- PR touches only policy YAML parsing files (policy_checks.py, ci_checks.py, models.py) and documentation -- no auth, token, credential, or host-classification paths are affected.
OSS Growth Hacker
Required findings above. No additional findings beyond what is captured in the CEO growth/positioning note.
Verdict computed deterministically: 10 required findings across 5 active panelists. APPROVE iff N == 0. Push a new commit to clear this verdict label automatically.
Architecture:
- Hoist manifest parsing out of _check_lockfile_exists into run_baseline_checks -- single parse eliminates TOCTOU window
- _check_lockfile_exists now accepts Optional[APMPackage] and always returns name="lockfile-exists" (contract violation fixed)
- Remove redundant clear_apm_yml_cache() + second from_apm_yml() call
- Add OSError to catch tuple in run_baseline_checks
- Add _logger + debug logging for lockfile read exceptions
UX:
- All manifest-parse error messages include remediation hint
- CHANGELOG: add Migration block with detection command
- governance-guide.md: disambiguate org policy vs project manifest rows
- security.md: add remediation callout
Docstrings:
- _load_raw_apm_yml: document ordering, defence-in-depth contract
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
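The hoisting described in this commit can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged code: `Manifest` stands in for APMPackage, the `parse` callable stands in for APMPackage.from_apm_yml, the lockfile name `apm.lock` is a placeholder, and the message wording is invented. The PR also catches yaml.YAMLError; that is omitted here to keep the sketch stdlib-only.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable, List, Optional


@dataclass
class CheckResult:
    name: str
    passed: bool
    message: str


@dataclass
class Manifest:
    """Hypothetical stand-in for APMPackage."""
    dependencies: List[str] = field(default_factory=list)


def check_lockfile_exists(root: Path, manifest: Optional[Manifest]) -> CheckResult:
    """Never parses; always reports under the name 'lockfile-exists'."""
    if manifest is None or not manifest.dependencies:
        return CheckResult("lockfile-exists", True, "no dependencies declared")
    # "apm.lock" is a placeholder file name for this sketch.
    present = (root / "apm.lock").exists()
    message = "lockfile present" if present else "dependencies declared but no lockfile"
    return CheckResult("lockfile-exists", present, message)


def run_baseline_checks(
    root: Path, parse: Callable[[Path], Manifest]
) -> List[CheckResult]:
    manifest_path = root / "apm.yml"
    manifest: Optional[Manifest] = None
    if manifest_path.exists():
        try:
            # Single parse: later checks reuse this object, so the file
            # cannot change between two separate reads.
            manifest = parse(manifest_path)
        except (ValueError, OSError) as exc:
            # OSError also covers the file vanishing after exists().
            hint = "Fix the YAML syntax in apm.yml and re-run the audit."
            return [CheckResult("manifest-parse", False, f"{exc}. {hint}")]
    return [check_lockfile_exists(root, manifest)]
```

With the parse owned by the caller, each function emits exactly one check name, which is the contract the panel asked for.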
Panel Findings -- Triage & Resolution
Thanks for the thorough review. All required findings have been triaged and addressed. Here's the breakdown:
Fixed (8/10 required items)
Deferred (1/10)
Nits addressed
Not addressed (rationale)
Description
Replace silent policy-enforcement bypass with failing manifest-parse CheckResult when apm.yml is malformed YAML or not a mapping. Four code paths in policy_checks.py and ci_checks.py swallowed parse errors and returned permissive defaults, allowing an attacker to bypass all policy enforcement by introducing a subtly malformed manifest.
Fixes #936
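To make the fail-closed behaviour concrete, here is a minimal sketch of the pattern, assuming simplified stand-ins for the real types. `parse_manifest` is a hypothetical helper in place of APMPackage.from_apm_yml, the dataclasses carry only the fields needed for the illustration, and the message text is invented.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import List

import yaml  # PyYAML, assumed to be installed


@dataclass
class CheckResult:
    name: str
    passed: bool
    message: str


@dataclass
class CIAuditResult:
    checks: List[CheckResult] = field(default_factory=list)

    @property
    def passed(self) -> bool:
        # An empty list passes vacuously, which is why a silent early
        # return used to read as "all checks passed".
        return all(c.passed for c in self.checks)


def parse_manifest(path: Path) -> dict:
    """Hypothetical parser: raise ValueError when the document is not a mapping."""
    data = yaml.safe_load(path.read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        raise ValueError(f"expected a mapping, got {type(data).__name__}")
    return data


def run_policy_checks(project_root: Path) -> CIAuditResult:
    result = CIAuditResult()
    manifest_path = Path(project_root) / "apm.yml"
    if not manifest_path.exists():
        return result  # no manifest: nothing to enforce
    try:
        parse_manifest(manifest_path)
    except (ValueError, yaml.YAMLError, OSError) as exc:
        # Fail closed: record an explicit failing check, never an empty result.
        result.checks.append(
            CheckResult(
                name="manifest-parse",
                passed=False,
                message=f"apm.yml could not be parsed: {exc}. Fix the file and re-run.",
            )
        )
        return result
    # Policy checks against the parsed manifest would follow here.
    return result
```

The key design point is that the failure is represented as data in the result, so downstream consumers such as JSON or SARIF output see it without special handling.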
Type of change
Testing
Details
The vulnerability
| Code path | Before | After |
|---|---|---|
| run_policy_checks() | except (ValueError, FileNotFoundError): return result -- empty CIAuditResult, all checks trivially pass | Catches (ValueError, yaml.YAMLError), appends failing CheckResult(name="manifest-parse", passed=False) |
| run_baseline_checks() | Same pattern | Failing CheckResult instead of silent return |
| _check_lockfile_exists() | except (ValueError, FileNotFoundError) returns passed=True | Catches (ValueError, yaml.YAMLError), returns passed=False |
| _load_raw_apm_yml() | except Exception: return None -- swallows everything silently | Catches yaml.YAMLError and OSError specifically, logs WARNING |
What's NOT changed
The org policy file (apm-policy.yml) loading path was already correct -- parser.py::load_policy() raises PolicyValidationError on malformed YAML, and outcome_routing.py handles malformed outcomes via the documented fetch_failure_default knob. This PR fixes the project manifest (apm.yml) parse path only.
Implementation (3 files)
- src/apm_cli/policy/policy_checks.py -- fix _load_raw_apm_yml() and run_policy_checks()
- src/apm_cli/policy/ci_checks.py -- fix _check_lockfile_exists() and run_baseline_checks()
- src/apm_cli/policy/models.py -- add manifest-parse to _CHECK_ARTIFACT_MAP
Tests (2 files, 18 new tests)
- tests/unit/policy/test_policy_checks.py -- TestLoadRawApmYml (6 tests), TestRunPolicyChecksMalformedManifest (5 tests incl. regression guard)
- tests/unit/policy/test_ci_checks.py -- TestCheckLockfileExistsMalformedManifest (4 tests), TestRunBaselineChecksMalformedManifest (3 tests incl. monkeypatched second-catch path)
Documentation (5 files)
- CHANGELOG.md -- Security entry under [Unreleased]
- docs/src/content/docs/enterprise/security.md -- manifest integrity bullet
- docs/src/content/docs/enterprise/policy-reference.md -- manifest_parse outcome row
- docs/src/content/docs/enterprise/governance-guide.md -- failure semantics table row
- packages/apm-guide/.apm/skills/apm-usage/governance.md -- clarifying paragraph
Validation
audit_report.py syntax error excluded)