fix(ci): deploy docs after bot-cut releases via workflow_call #953

Merged
danielmeppiel merged 1 commit into main from fix/docs-deploy-on-bot-release
Apr 26, 2026

@danielmeppiel
Collaborator

TL;DR

The Deploy Docs workflow listens for release: published, but that event never fires when the release is created by GITHUB_TOKEN — a documented Actions safeguard against recursion. The CI/CD Pipeline cuts releases as github-actions[bot], so docs haven't auto-deployed since v0.9.3 (and won't going forward). This PR fixes it by converting docs.yml into a reusable workflow and invoking it directly from the release job.

Problem (WHY)

  • v0.9.3 published at 2026-04-26T12:04:31Z (release)
  • API check: repos/microsoft/apm/actions/runs?event=release returns 0 runs in the last 30 days, across the entire repo
  • Last actual Deploy Docs deploy was a manual workflow_dispatch on 2026-04-23 (3 days before v0.9.3)
  • Root cause: release was authored by github-actions[bot] (confirmed via API). GitHub Actions documents: "events triggered by the GITHUB_TOKEN [...] will not create a new workflow run"

So the published site at https://microsoft.github.io/apm still serves pre-0.9.3 content despite a successful release.

Approach (WHAT)

Convert docs.yml into a reusable workflow (workflow_call) and have the CI/CD Pipeline's create-release job invoke it directly via a new deploy-docs job. No new credentials; no event-loss risk.

Why this over the alternatives

| Approach | New creds? | Coupling | Failure modes |
| --- | --- | --- | --- |
| workflow_call from release job (this PR) | None | Explicit, intentional | Direct invocation; no event-loss risk |
| workflow_run event chain | None | Implicit, name-based ("CI/CD Pipeline") | Always runs the default-branch workflow file (cut-at-tag pinning gotcha); fires on every CI/CD completion → must filter conclusion + branch + detect "was a release cut" |
| Fine-grained PAT in release step | PAT secret | Loose (release event) | Tied to a user; expires (max 1y); Microsoft PAT policies still apply; rotation burden |
| GitHub App token | App install | Loose (release event) | App install on a Microsoft-org repo requires security review |
| Fold docs into CI/CD Pipeline | None | Maximally tight | Loses PR-time docs build; can't deploy docs without re-running CI |
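
For contrast, the rejected workflow_run alternative would look roughly like the sketch below. It is illustrative only and not part of this PR: the workflow name and job body are placeholders, and the release-detection step is left as a comment because that logic would have to be written from scratch.

```yaml
# Illustrative sketch of the rejected workflow_run alternative (not in this PR).
name: Deploy Docs (workflow_run variant)

on:
  workflow_run:
    # Coupled to the upstream workflow by its display name.
    workflows: ["CI/CD Pipeline"]
    types: [completed]

jobs:
  deploy:
    # Fires on every upstream completion, so failed runs must be filtered out here.
    if: github.event.workflow_run.conclusion == 'success'
    runs-on: ubuntu-latest
    steps:
      # Checks out the default branch, not the tag the release was cut from.
      - uses: actions/checkout@v4
      # Placeholder: a real job would still have to work out whether the
      # upstream run actually cut a stable release before deploying.
      - run: echo "detect release, then build and deploy docs"
```

Every condition in that `if:` and every detection step is logic the workflow_call approach avoids, since the release job already knows it has just cut a release.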

The workflow_call approach is the right level of coupling for this case: the docs site is part of the release artifact, not an independent observer, so the release pipeline declaring "publishing a release includes deploying its docs" is correct domain modeling.

Implementation (HOW)

docs.yml — add workflow_call trigger with an is_prerelease input, extend the existing gating conditions on build and deploy jobs to honor it. Keep release: published, pull_request, and workflow_dispatch triggers as-is for backward compatibility (human-cut releases, PR builds, manual re-publish all continue to work).
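
A simplified sketch of what the docs.yml side could look like follows. It is not the PR's actual diff: the build step, the `./site` output path, and the exact `if:` expressions are assumptions, and the prerelease check for human-cut `release` events is omitted for brevity.

```yaml
# Hypothetical sketch of .github/workflows/docs.yml; not the PR's actual diff.
name: Deploy Docs

on:
  release:
    types: [published]   # still fires for human-cut releases
  pull_request:
    paths: ["docs/**"]   # build-only runs on docs changes
  workflow_dispatch:     # manual re-publish
  workflow_call:         # new: direct invocation from the release pipeline
    inputs:
      is_prerelease:
        description: "Set by the caller; true skips the deploy"
        type: boolean
        default: false

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Placeholder for the real site build.
      - run: mkdir -p site && echo "docs placeholder" > site/index.html
      - if: github.event_name != 'pull_request' && inputs.is_prerelease != true
        uses: actions/upload-pages-artifact@v3
        with:
          path: ./site

  deploy:
    needs: build
    # In a called run, github.event_name reflects the caller's event, so this
    # assumed gate keys off the input rather than a 'workflow_call' event name.
    if: github.event_name != 'pull_request' && inputs.is_prerelease != true
    runs-on: ubuntu-latest
    permissions:
      pages: write
      id-token: write
    environment:
      name: github-pages
    steps:
      - uses: actions/deploy-pages@v4
```

For triggers other than workflow_call, `inputs.is_prerelease` is unset, so the comparison against `true` passes and the existing release and manual paths keep deploying.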

build-release.yml — add a deploy-docs job that needs: [create-release], gated to stable tags only (is_prerelease != 'true'), and uses: ./.github/workflows/docs.yml with is_prerelease: false. Job declares its own permissions: (contents: read, pages: write, id-token: write) — these bound what the called workflow can request.

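deploy-docs:
  name: Deploy Docs
  needs: [create-release]
  # Stable tags only; prerelease tags skip the docs deploy.
  if: github.ref_type == 'tag' && needs.create-release.outputs.is_prerelease != 'true'
  # Call the reusable workflow from this repository at the same commit.
  uses: ./.github/workflows/docs.yml
  # Upper bound on what the called workflow's jobs may request.
  permissions:
    contents: read
    pages: write
    id-token: write
  with:
    is_prerelease: false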
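
The `if:` gate reads `needs.create-release.outputs.is_prerelease`, so `create-release` has to publish that value as a job output. The PR does not show that side. The fragment below is one assumed way to derive it, treating any tag that contains a hyphen (for example `v1.2.3-rc1`) as a prerelease; the step id `classify` is hypothetical.

```yaml
# Assumed fragment of the create-release job; the real job may derive this differently.
create-release:
  runs-on: ubuntu-latest
  outputs:
    # Job outputs are strings, hence the comparison against 'true' in deploy-docs.
    is_prerelease: ${{ steps.classify.outputs.is_prerelease }}
  steps:
    - id: classify
      run: |
        # Treat tags with a hyphen suffix (vX.Y.Z-rc1) as prereleases.
        if [[ "$GITHUB_REF_NAME" == *-* ]]; then
          echo "is_prerelease=true" >> "$GITHUB_OUTPUT"
        else
          echo "is_prerelease=false" >> "$GITHUB_OUTPUT"
        fi
```

Because the output arrives as the string `'true'` or `'false'`, the caller compares against `'true'` and passes a literal boolean `false` to the reusable workflow's `is_prerelease` input.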

Trade-offs

  • Trades event-driven loose coupling for explicit pipeline composition. For a single repo where the release pipeline owns end-to-end shipping (artifacts + docs + downstream packages), composition is the right call. If APM later grows to N independent post-release consumers maintained by different teams, revisit (App-token + release: published becomes more attractive at that scale).
  • release: published trigger retained. Slightly more surface area in the trigger list, but it's a safety net — human-cut releases (rare but possible) still deploy docs without further work.

Validation evidence

  • actionlint clean for the changes (only pre-existing SC2086 shellcheck infos elsewhere in build-release.yml)
  • YAML loads cleanly via yaml.safe_load; both jobs lists parse as expected:
    • docs.yml jobs: build, deploy
    • build-release.yml jobs: ..., create-release, deploy-docs, gh-aw-compat, ...
  • ASCII-only (verified: 0 non-ASCII bytes added in docs.yml)
  • All existing trigger paths preserved (PR-time build, manual workflow_dispatch, human-cut release: published)

How to test

  1. Backfill v0.9.3: trigger Deploy Docs via workflow_dispatch on main to republish the v0.9.3 docs site immediately (no merge needed).
  2. End-to-end on next release: cut the next tag (v0.9.4 or later); confirm a Deploy Docs job now appears as a child of the CI/CD Pipeline run, and https://microsoft.github.io/apm reflects the new version within minutes.
  3. PR build still works: this PR itself should trigger a Deploy Docs build-only run if it touches docs/** (it doesn't, so no run expected — matches current behavior).
  4. Prerelease gating: cut any prerelease tag (vX.Y.Z-rc1); confirm deploy-docs job is skipped (if: evaluates false).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Releases created by GITHUB_TOKEN do not trigger downstream workflow
runs (a documented Actions safeguard against recursion), so the
`release: published` trigger on docs.yml never fired for v0.9.3 --
which is why https://microsoft.github.io/apm still serves pre-0.9.3
content even though the v0.9.3 release published successfully.

Fix: convert docs.yml into a reusable workflow (`workflow_call` with
an `is_prerelease` input) and have the CI/CD Pipeline's release job
invoke it directly via a new `deploy-docs` job after `create-release`.

This is the cleanest fix under Microsoft org policy:

- No new credentials. A PAT or GitHub App token would also work
  (release would then be attributed to a non-bot identity, so
  `release: published` would fire), but App installs on Microsoft-org
  repos require security review, and PATs are tied to a user and have
  rotation/policy overhead.
- No `workflow_run` gotchas. `workflow_run` would tightly couple
  docs.yml to the CI/CD Pipeline name, always run the default-branch
  workflow file (cut-at-tag pinning gotcha), and force docs.yml to
  re-derive whether the upstream run actually cut a release.
- Backward compatible. `release: published` and `workflow_dispatch`
  triggers are kept, so human-cut releases and manual re-publishes
  continue to work unchanged.
- Visible in the Actions DAG. The deploy appears as a child of the
  release run, so 'did docs deploy?' is one click from the release
  run instead of buried in a separate workflow timeline.

For v0.9.3 specifically, the docs site can be republished via a
manual `Deploy Docs` workflow_dispatch run on `main`.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 26, 2026 13:53
Contributor

Copilot AI left a comment

Pull request overview

This PR fixes docs not deploying after bot-created releases by converting the existing Deploy Docs workflow into a reusable workflow (workflow_call) and explicitly invoking it from the release pipeline after a stable tag release is published.

Changes:

  • Added workflow_call support to docs.yml with an is_prerelease input and updated gating so stable workflow_call runs upload artifacts and deploy.
  • Added a deploy-docs job in build-release.yml that calls docs.yml after create-release, skipping prereleases.

| File | Description |
| --- | --- |
| .github/workflows/docs.yml | Adds workflow_call entrypoint + prerelease-aware gating for upload/deploy. |
| .github/workflows/build-release.yml | Invokes docs deployment explicitly after create-release for stable tags. |

Copilot's findings

  • Files reviewed: 2/2 changed files
  • Comments generated: 0

danielmeppiel merged commit 3922c0d into main Apr 26, 2026
18 checks passed
danielmeppiel deleted the fix/docs-deploy-on-bot-release branch April 26, 2026 14:10
danielmeppiel added a commit that referenced this pull request Apr 27, 2026
* chore(release): cut 0.9.4

CHANGELOG entry for 0.9.4 covers all 8 PRs merged since v0.9.3:

- #974 SKILL_BUNDLE day-0 install parity (Added)
- #954 automate apm-triage-panel workflow (Added)
- #970 python-architect mermaid classDiagram trap (Changed)
- #911 REQUESTS_CA_BUNDLE TLS validation (Fixed)
- #971 triage-panel project-sync dispatch (Fixed)
- #910 CLI consistency cleanup (Fixed)
- #958 issue templates label taxonomy (Fixed)
- #953 docs auto-deploy after bot-cut releases (Fixed)

Open milestone 0.9.4 issues (41) reassigned to 0.9.5.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore(changelog): tighten 0.9.4 entries (so-what for developers)

Refactor per Keep-a-Changelog spirit: lead with developer impact,
trim agent-internals prose, group maintainer-only changes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore(changelog): add #660 install.sh air-gapped entry to 0.9.4

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
danielmeppiel added a commit that referenced this pull request Apr 27, 2026
Copilot reviewer caught: #skill-bundle was wrong on two counts.
1. Rendered anchor is #skill-bundle-skillmd-at-root, not #skill-bundle --
   slugifier strips parens/dots/dashes differently than expected.
2. That heading documents the SKILL.md-at-root shape, but npx skills'
   layout is documented under 'Skill collection (skills/<name>/SKILL.md)'
   further down the same page.

The 'Skill collection' anchor exists in source (added in 0.9.4) but is
not deployed yet -- v0.9.4 docs auto-deploy from #953 is mid-rollout.
Linking to the page root is robust today and lands on the right doc;
both shapes are visible without a scroll.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
danielmeppiel added a commit that referenced this pull request Apr 27, 2026
* docs(readme): add 'Coming from npx skills add?' conversion block

Surgical insert between the apm.yml example and the three promises -- the
funnel moment where npx-skills users decide whether to switch. Names
vercel-labs/agent-skills and the real 'deploy-to-vercel' skill so the claim
is verifiable in 30 seconds. Bold inline lead (no h3) keeps the page flow
intact for non-npx readers; one outbound link defers prose to docs.

Both commands verified locally end-to-end: whole-bundle install integrates
7 skills; --skill deploy-to-vercel integrates 1 and persists the subset
into apm.yml + apm.lock.yaml's skill_subset.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs(readme): drop fragment, link to package-types page root

Copilot reviewer caught: #skill-bundle was wrong on two counts.
1. Rendered anchor is #skill-bundle-skillmd-at-root, not #skill-bundle --
   slugifier strips parens/dots/dashes differently than expected.
2. That heading documents the SKILL.md-at-root shape, but npx skills'
   layout is documented under 'Skill collection (skills/<name>/SKILL.md)'
   further down the same page.

The 'Skill collection' anchor exists in source (added in 0.9.4) but is
not deployed yet -- v0.9.4 docs auto-deploy from #953 is mid-rollout.
Linking to the page root is robust today and lands on the right doc;
both shapes are visible without a scroll.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs(readme): fix broken anchor in npx skills add conversion block

Agent-Logs-Url: https://github.com/microsoft/apm/sessions/6725d577-cad3-4772-b2d5-d1d6bf44e926

Co-authored-by: danielmeppiel <51440732+danielmeppiel@users.noreply.github.com>

* docs(readme): restore precise #skill-collection anchor now that 0.9.4 docs are live

v0.9.4 docs deploy completed (manual workflow_dispatch run 24983311706
after the underlying skip bug was identified -- fix in #981). The live
page now exposes #skill-collection-skillsnameskillmd, so we can land
readers exactly on the heading that documents the skills/<name>/SKILL.md
layout npx skills users already know.

Verified live:
  curl -s .../reference/package-types/ | grep -oE 'id="skill[^"]*"'
  -> id="skill-bundle-skillmd-at-root"
  -> id="skill-collection-skillsnameskillmd"

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
danielmeppiel pushed a commit that referenced this pull request Apr 27, 2026
Promotes [Unreleased] to [0.10.0] - 2026-04-27. Each PR since v0.9.4
gets one 'so what' line:

- #926 Microsoft 365 Cowork target ships impl
- #790 marketplace authoring CLI (init, package add/set, build, check,
  outdated, doctor, publish) -- collapsed from 20+ bullets to one
- #722 marketplace plugin -> package rename + --help sectioning -- collapsed
- #980 README 'Coming from npx skills add' conversion block
- #981 docs auto-deploy on tag push (real fix for the #953 attempt)
- #985 pr-description-skill evals suite
- #984 pr-description-skill mermaid hardening
- #989 cowork sys.platform mock for Windows CI

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>