PR-context workflows still lose APM-restored skills after v0.71.0 #28221

@theletterf

Summary

We now have a small public repro showing that APM-restored skills are invocable in workflow_dispatch runs, but still unavailable in pull_request runs under engine: copilot.

This reproduces even after recompiling the workflow against gh-aw v0.71.0, which includes #28002, the change that appears intended to fix PR-context APM restore ordering.

Public repro

Repo:

The workflow:

  • imports one public skill through shared/apm.md
  • uses engine: copilot
  • runs in two contexts: workflow_dispatch and pull_request
  • avoids fallback manual review so the result reflects skill availability only

Imported skill:

  • elastic/elastic-docs-skills/skills/review/docs-check-style

What works

workflow_dispatch

The skill is restored and invocable in dispatch runs.

Successful run examples:

The smoke test confirmed that the agent could invoke the skill and recover exact values from the returned skill context, including:

  • version 1.0.5
  • allowed-tools Read, Grep, Glob, Bash(vale *), WebFetch
  • first source URL https://www.elastic.co/docs/contribute-docs/style-guide

What fails

pull_request

With the same repo, same engine, same imported skill, and same workflow family, skill invocation fails in PR context.

Failing PR run on older compile:

Failing PR run after recompiling the smoke test against gh-aw v0.71.0:

The PR-context result is explicit:

Skill invocation failed

Invocation attempted: `skill(skill: docs-check-style)`

Result: The skill `docs-check-style` was not found.
The only available skill is `customizing-copilot-cloud-agents-environment`.

Why this is surprising

PR #28002 appears to be the intended fix:

That PR description says shared APM restore was moved into pre-agent-steps so PR-context base restore would no longer clobber .github/skills before agent startup.
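To make the suspected mechanism concrete, here is a minimal local simulation of the clobbering that #28002 describes. It is a sketch under assumptions, not gh-aw's actual implementation: the helper names are hypothetical, the `SKILL.md` file name is assumed, and the base restore is modeled as a wholesale replacement of `.github`, which may not match what the real step does.

```python
import shutil
import tempfile
from pathlib import Path


def restore_apm_skill(workspace: Path, skill: str) -> None:
    """Hypothetical stand-in for the APM restore step: writes one skill under .github/skills."""
    skill_dir = workspace / ".github" / "skills" / skill
    skill_dir.mkdir(parents=True, exist_ok=True)
    # The SKILL.md file name is an assumption made for illustration.
    (skill_dir / "SKILL.md").write_text("name: " + skill + "\n")


def restore_github_from_base(workspace: Path, base_snapshot: Path) -> None:
    """Hypothetical stand-in for the base-branch restore: replaces .github wholesale."""
    target = workspace / ".github"
    if target.exists():
        shutil.rmtree(target)
    shutil.copytree(base_snapshot / ".github", target)


def skill_survives(apm_first: bool, skill: str = "docs-check-style") -> bool:
    """Run both restores in the given order and report whether the skill is still on disk."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        workspace = root / "workspace"
        base = root / "base"
        (workspace / ".github").mkdir(parents=True)
        # The base snapshot has a .github folder but no APM-restored skills.
        (base / ".github" / "workflows").mkdir(parents=True)
        if apm_first:
            restore_apm_skill(workspace, skill)
            restore_github_from_base(workspace, base)
        else:
            restore_github_from_base(workspace, base)
            restore_apm_skill(workspace, skill)
        return (workspace / ".github" / "skills" / skill / "SKILL.md").exists()
```

Under these assumptions, `skill_survives(apm_first=True)` reports that the skill is gone, while `skill_survives(apm_first=False)` reports that it is still present. That matches the intent described for #28002: the APM restore has to run after the base restore, not before it.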

However, the public PR-context repro still fails after recompiling against v0.71.0.

Key observations

  1. This is not a blanket APM import failure.
    • APM pack/unpack succeeds.
    • Dispatch runs can invoke the skill successfully.
  2. This is not a skill-content issue.
    • The same public Elastic Docs Skill works in dispatch context.
  3. This is trigger-context specific.
    • workflow_dispatch: success.
    • pull_request: failure.
  4. In the generated PR-context job order we still observed:
    • Restore APM packages
    • later Restore agent config folders from base branch

    That ordering still looks suspicious because the restore step overwrites .github from the trusted base snapshot.
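The ordering in observation 4 can be checked mechanically once the step names of the compiled job are available as a list. The sketch below uses the two step names quoted above. The function name and the other step names in the usage example are hypothetical, and how the list is extracted from the compiled workflow is left open.

```python
from typing import Sequence

# Step names as quoted in the observation above.
APM_STEP = "Restore APM packages"
BASE_STEP = "Restore agent config folders from base branch"


def base_restore_follows_apm(step_names: Sequence[str]) -> bool:
    """Return True when the base-branch restore runs after the APM restore.

    Returns False if either step is missing, since no ordering can be
    established in that case.
    """
    names = list(step_names)
    if APM_STEP not in names or BASE_STEP not in names:
        return False
    return names.index(BASE_STEP) > names.index(APM_STEP)


# Hypothetical job layout mirroring the order observed in the PR-context run.
observed = [
    "Checkout repository",
    "Restore APM packages",
    "Restore agent config folders from base branch",
    "Execute agent",
]
suspicious = base_restore_follows_apm(observed)
```

For the `observed` list, `suspicious` is `True`, which is the ordering flagged in this issue. A job where the base restore comes first would return `False`.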

Request

Could you investigate why APM-restored skills remain available in dispatch runs but are still unavailable in PR runs under engine: copilot, even after the v0.71.0 change associated with #28002?

Possible angles:

Related context

Earlier issue:

That issue started from a larger private repro; the repo above is the cleaner public repro with both success and failure modes.
