Automate NOTICE.md maintenance with SBOM-driven generator + CI drift gate #1044

@danielmeppiel

Problem

Following PR #1043, the repo now ships a hand-curated NOTICE.md enumerating third-party components and their license / attribution text. Hand-curation does not scale:

  • Direct dependencies will drift over time as pyproject.toml evolves
  • Upstream projects can change their license versions or their NOTICE / AUTHORS files without us noticing
  • Contributors adding deps cannot reasonably be expected to manually update NOTICE
  • There is currently no CI gate preventing the file from going stale

This is the same problem that the broader Python / OSS ecosystem and other Microsoft repositories solve with a generator + drift check (e.g. VS Code's oss-attribution-generator, Component Governance).

Proposal

Four pieces (three core, plus an optional supply-chain bonus), designed to be additive (no behavior change for users, no runtime dep change):

1. Generator script — scripts/generate-notice.py

  • Reads [project] dependencies from pyproject.toml (direct deps only — industry-standard NOTICE scope)
  • For each dep, resolves license text + optional NOTICE / AUTHORS attribution via:
    • importlib.metadata against the installed uv-managed env (primary — fastest, deterministic)
    • PyPI JSON API as fallback when local metadata is incomplete
  • Renders the NOTICES template format that CELA documents (header line + per-component sections)
  • Two modes:
    • default — regenerate NOTICE.md
    • --check — exit 1 if regeneration would change the file (used by CI)
  • ~200 LOC, no new runtime dependency

2. SBOM generation — CycloneDX

  • Add cyclonedx-bom to [dependency-groups] dev
  • Generator script emits a CycloneDX JSON SBOM alongside NOTICE
  • SBOM is the single source of truth: NOTICE is a view over it
  • Foundation for future supply-chain reporting (US EO 14028, EU CRA)
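
To illustrate "NOTICE is a view over the SBOM", here is a rough sketch that builds a minimal CycloneDX-shaped JSON document and renders notice sections from it. In practice the SBOM would come from cyclonedx-bom rather than be hand-built; `build_sbom`, `render_notice`, the input dict shape, and the placeholder header and section layout are all assumptions, not the CELA template.

```python
import json


def build_sbom(components: list[dict]) -> dict:
    """Build a minimal CycloneDX-style document from name/version/license dicts."""
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "version": 1,
        "components": [
            {
                "type": "library",
                "name": c["name"],
                "version": c["version"],
                "licenses": [{"license": {"id": c["license"]}}],
            }
            # Stable ordering keeps regenerated output reproducible.
            for c in sorted(components, key=lambda c: c["name"].lower())
        ],
    }


def render_notice(sbom: dict) -> str:
    """Render a NOTICE-like text purely from the SBOM (placeholder layout)."""
    lines = ["Third-party notices (placeholder header)", ""]
    for comp in sbom["components"]:
        ids = [
            entry["license"].get("id") or entry["license"].get("name", "UNKNOWN")
            for entry in comp.get("licenses", [])
            if "license" in entry
        ]
        lines.append(f"## {comp['name']} {comp['version']}")
        lines.append(f"License: {', '.join(ids) or 'UNKNOWN'}")
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    sbom = build_sbom([{"name": "rich", "version": "13.0.0", "license": "MIT"}])
    print(json.dumps(sbom, indent=2))
    print(render_notice(sbom))
```

Because `render_notice` reads only the SBOM, anything that changes the SBOM changes NOTICE in the same run, and the two cannot drift apart.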

3. CI drift gate — .github/workflows/notice-drift.yml

  • Triggers: pull_request + merge_group (mirrors ci.yml pattern)
  • Runs the generator in --check mode → fails with actionable error if NOTICE is stale
  • Uploads SBOM as workflow artifact (30 day retention)
  • Wired into merge-gate.yml EXPECTED_CHECKS so it becomes a required check at the same authority level as Build & Test

4. Supply-chain bonus — license policy gate

  • Add actions/dependency-review-action to fail PRs that would introduce GPL-* / AGPL-* / SSPL-* etc. into runtime deps
  • Zero runtime impact; PR-time only
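
The policy itself boils down to matching SPDX identifiers against a deny list. The sketch below is a standalone illustration of that matching logic only; it is not the configuration syntax of actions/dependency-review-action, and `DENIED_PATTERNS` and `denied_licenses` are hypothetical names.

```python
from fnmatch import fnmatchcase

# Patterns taken from the bullet above; the "etc." is deliberately not filled in.
DENIED_PATTERNS = ("GPL-*", "AGPL-*", "SSPL-*")


def denied_licenses(spdx_ids: list[str]) -> list[str]:
    """Return the SPDX identifiers that match any denied pattern."""
    return sorted(
        spdx_id
        for spdx_id in spdx_ids
        # fnmatchcase matches the whole string, so "LGPL-2.1-only"
        # does not match "GPL-*".
        if any(fnmatchcase(spdx_id, pattern) for pattern in DENIED_PATTERNS)
    )
```

Whether LGPL-* should also be denied is a policy decision that this sketch leaves open.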

Why CycloneDX, not pip-licenses

pip-licenses is fine for license listing but doesn't produce a structured SBOM. CycloneDX is:

  • An OWASP standard
  • Widely adopted in Microsoft (used by Defender for Cloud's container scanning, GitHub Advanced Security)
  • One of the SBOM formats accepted for compliance reporting going forward (US EO 14028, EU CRA Article 7)
  • Future-proof: the SBOM can drive vulnerability scanning, license policy, and NOTICE simultaneously

Acceptance criteria

  • uv run python scripts/generate-notice.py regenerates NOTICE.md
  • uv run python scripts/generate-notice.py --check exits 0 if up to date, 1 with diff if not
  • CI gate fails when a dep is added to pyproject.toml without regenerating NOTICE.md
  • CI gate uploads SBOM as a workflow artifact on every PR
  • New check is added to merge-gate.yml EXPECTED_CHECKS
  • Generated NOTICE.md is byte-for-byte identical to the manually authored file from PR #1043 (proof that the script reproduces our existing output before we entrust CI to it)
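
The --check criterion could be met with something as small as the sketch below: compare the committed text with the regenerated text, and return exit code 1 plus a unified diff when they differ. `check_notice` and the diff labels are hypothetical; the real script would read NOTICE.md from disk and pass the code to `sys.exit`.

```python
import difflib


def check_notice(current: str, regenerated: str) -> tuple[int, str]:
    """Return (exit_code, diff): 0 and "" when identical, else 1 and a unified diff."""
    if current == regenerated:
        return 0, ""
    diff = difflib.unified_diff(
        current.splitlines(keepends=True),
        regenerated.splitlines(keepends=True),
        fromfile="NOTICE.md (committed)",
        tofile="NOTICE.md (regenerated)",
    )
    return 1, "".join(diff)
```

The same function also covers the byte-for-byte criterion against the file from PR #1043, provided both inputs are read without newline translation.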

Out of scope (follow-ups)

  • Transitive dependency NOTICE expansion (current direct-only scope is industry-standard; expand only if CELA mandates)
  • Publishing SBOM as a release asset on build-release.yml (one-line addition; do once design lands)
  • Vulnerability scanning over the SBOM (separate concern)
