[copilot-cli-research] Copilot CLI Deep Research - 2026-03-19 #21858

2026-03-19T21:29:15Z

github-actions[bot]
bot Mar 19, 2026

Analysis Date: 2026-03-19
Repository: github/gh-aw
Triggered by: @pelikhan
Scope: 175 total workflows, 83 using Copilot engine (47%)

📊 Executive Summary

This first comprehensive analysis of Copilot CLI usage across 175 agentic workflows reveals a significant gap between available features and actual adoption. Of 83 Copilot-engine workflows, only 12 (14%) use the AWF sandbox firewall, while 7 of 9 custom agent files go unused. Key advanced features like max-continuations, startup-timeout, engine.env, and granular GitHub tool permissions are almost entirely absent despite being fully supported.

The most impactful opportunity is broader AWF sandbox adoption — 71 Copilot workflows currently run with full network access, creating unnecessary security exposure. The second biggest opportunity is leveraging existing custom agent files (.github/agents/*.agent.md) to provide specialized personas and behavioral instructions to specific workflow types.

Primary Recommendation: Systematically enable AWF sandbox for security-sensitive workflows and wire up unused custom agent files to appropriate workflows.

🔴 Critical Findings

Security: 71 of 83 Copilot Workflows Run Without AWF Sandbox

Most Copilot workflows execute with unrestricted network access. Only 12 workflows use sandbox: agent: awf. This is the highest-impact opportunity — especially for workflows that process untrusted input (issues, PRs, comments).

Underutilization: 7/9 Custom Agent Files Are Never Used

The repository has 9 .github/agents/*.agent.md files, but only ci-cleaner and technical-doc-writer are wired to workflows. The following agents are orphaned:

agentic-workflows.agent.md
contribution-checker.agent.md
create-safe-output-type.agent.md
custom-engine-implementation.agent.md
grumpy-reviewer.agent.md
interactive-agent-designer.agent.md
w3c-specification-writer.agent.md

🟡 Medium Priority Issues

max-continuations Nearly Absent: Only 1 workflow (smoke-copilot) sets max-continuations: 2. gh-aw maps this setting to Copilot CLI's --autopilot --max-autopilot-continues flags, which suit multi-step workflows such as ci-doctor, code-scanning-fixer, and hourly-ci-cleaner.

GitHub Tool Permissions Too Broad: 122 workflows use toolsets: [default] which grants broad GitHub MCP access. Only 8 workflows define granular allowed: lists. For read-only workflows, specifying toolsets: [repos, issues] instead of [default] would reduce attack surface.

startup-timeout Never Used: Despite being a supported feature, no workflow sets startup-timeout. Setting it detects hung Copilot CLI processes that never initialize, which avoids wasted runner time.

1️⃣ Current State Analysis

View Copilot CLI Capabilities Inventory

Available Engine Configuration Options

engine:
  id: copilot
  version: "0.0.422"          # Pin CLI version (default: latest)
  model: gpt-5.1-codex-mini   # Override model
  agent: agent-id              # Custom .github/agents/*.agent.md
  command: /path/to/copilot    # Custom executable path
  args: ["--add-dir", "/extra"] # Extra CLI args
  env:                         # Custom environment variables
    MY_VAR: "value"
  api-target: api.acme.ghe.com # Enterprise endpoint

CLI Flags Auto-Generated by gh-aw

--add-dir (workspace, /tmp/gh-aw/, cache-memory dirs)
--log-level all --log-dir (logs_folder)
--disable-builtin-mcps
--allow-tool (per configured tool)
--autopilot --max-autopilot-continues N (when max-continuations > 1)
--agent (id) (when engine.agent is set)

Available Sandbox Options

# Short form: enable the AWF network firewall
sandbox:
  agent: awf

# Extended form (use instead of the short form):
sandbox:
  agent:
    id: awf
    mounts:                # Additional filesystem mounts
      - "/usr/bin/make:/usr/bin/make:ro"
    memory: "4g"           # Container memory limit
    args: ["--extra-flag"] # Extra AWF args
    env:                   # Env vars on the AWF step
      MY_VAR: value
    command: custom-awf    # Custom sandbox binary

Available Feature Flags

features:
  copilot-requests: true       # Enable copilot-requests mode
  mcp-gateway: true            # Enable MCP gateway
  disable-xpia-prompt: true    # Disable XPIA injection prompt
  mcp-scripts: true            # Enable MCP scripts

View Usage Statistics

Metric	Count	Percentage
Total workflows	175	100%
Copilot engine workflows	83	47%
With AWF sandbox	12	14% of Copilot
With strict mode	117	67% of all
With timeout-minutes	169	97% of all
With copilot-requests feature	41	49% of Copilot
With model override	7	8% of Copilot
With engine.agent	2	2% of Copilot
With max-continuations	1	1% of Copilot
With web-fetch tool	20	24% of Copilot
With cache-memory	~10	12% of Copilot
With custom agent files wired	2	22% of available agents

Most Common Tools:

github with toolsets: [default] — 122 uses
github simple (no toolsets) — ~8 uses
bash with specific commands — many
bash: ["*"] (allow all) — 10 workflows
web-fetch — 20 workflows

Engine Distribution:

engine: copilot (simple) — 82 workflows
engine: {id: copilot, ...} (extended) — ~4 workflows
engine: claude / engine: codex — 20+ workflows
engine: gemini — handful

2️⃣ Feature Usage Matrix

Feature	Available	Used	Not Used	Usage Rate
`engine.model`	✅	7 workflows	—	8%
`engine.agent`	✅	2 workflows	—	2%
`engine.version`	✅	0 workflows	All	0%
`engine.args`	✅	0 workflows	All	0%
`engine.env`	✅	0 workflows	All	0%
`engine.api-target`	✅	0 workflows	All	0%
`sandbox.agent: awf`	✅	12 workflows	71	14%
`sandbox.agent.mounts`	✅	1 workflow	—	1%
`sandbox.agent.memory`	✅	0 workflows	All	0%
`max-continuations`	✅	1 workflow	—	1%
`startup-timeout`	✅	0 workflows	All	0%
`github.allowed` (granular)	✅	8 workflows	114	7%
`block-domains`	✅	0 workflows	All	0%
`features.copilot-requests`	✅	41 workflows	—	49%
`features.mcp-gateway`	✅	0 workflows	All	0%
`features.disable-xpia-prompt`	✅	0 workflows	All	0%
`features.mcp-scripts`	✅	1 workflow	—	1%
custom agent files	✅	2/9	7/9	22%
`cache-memory` tool	✅	~10 workflows	—	12%

3️⃣ Missed Opportunities

🔴 High Priority Opportunities

Opportunity 1: Enable AWF Sandbox for Security-Sensitive Workflows

What: 71 Copilot workflows run without the AWF network firewall
Why It Matters: Workflows processing untrusted content (issues, PR comments, external user input) are exposed to prompt injection and data exfiltration via unconstrained network access
Where: Any workflow triggered by issues, pull_request, issue_comment, or discussion events
How: Add sandbox: agent: awf and ensure network.allowed lists only required domains

sandbox:
  agent: awf
network:
  allowed:
    - defaults
    - github.com

Candidate workflows: auto-triage-issues, ai-moderator, bot-detection, contribution-check, code-scanning-fixer, breaking-change-checker, and ~30 others triggered by user content

Opportunity 2: Wire Unused Custom Agent Files to Workflows

What: 7 of 9 .github/agents/ files are never referenced by workflows
Why It Matters: Agent files provide specialized personas, capabilities, and behavioral instructions that make agents significantly more effective at specific tasks
Where: Workflows whose purpose aligns with unused agents

Agent File	Potential Workflows
`agentic-workflows.agent.md`	`create-agentic-workflow.md`, `update-agentic-workflow.md`
`contribution-checker.agent.md`	`contribution-check.md`
`grumpy-reviewer.agent.md`	`code-simplifier.md`, `cli-consistency-checker.md`
`w3c-specification-writer.agent.md`	`weekly-blog-post-writer.md`, `technical-doc-writer.md`
`interactive-agent-designer.agent.md`	`workflow-generator.md`, `create-agentic-workflow.md`

# Example: wire contribution-checker agent
engine:
  id: copilot
  agent: contribution-checker

🟡 Medium Priority Opportunities

Opportunity 3: Enable `max-continuations` for Complex Workflows

What: Only 1 workflow uses max-continuations, but Copilot CLI fully supports autopilot mode
Why It Matters: Complex multi-step tasks (CI fixes, code refactors, multi-file changes) benefit enormously from autopilot — the agent can loop, check its work, and retry without manual re-triggering
Where: ci-doctor.md, code-scanning-fixer.md, hourly-ci-cleaner.md, tidy.md, update-astro.md

max-continuations: 3  # Allow up to 3 autonomous continuation loops

Note from the codebase: This translates to --autopilot --max-autopilot-continues 3.

Opportunity 4: Add `startup-timeout` to All Copilot Workflows

What: Zero workflows set startup-timeout despite it being a supported field
Why It Matters: Without a startup timeout, a hung Copilot CLI process that never prints its first token will block the runner for the full timeout-minutes value. A startup timeout of 2-3 minutes catches initialization failures quickly.
Where: All Copilot workflows — add as a default

startup-timeout: 3  # Fail fast if agent doesn't start within 3 minutes
timeout-minutes: 30

Opportunity 5: Use Granular GitHub Tool Permissions

What: 122 workflows allow toolsets: [default] which includes repos, issues, pull requests, code scanning, and more — even when only a subset is needed
Why It Matters: Principle of least privilege — reducing tool access limits what a compromised or hallucinating agent can do
Where: Read-only reporting workflows that only need repository and issue access

# Before (too permissive):
tools:
  github:
    toolsets: [default]

# After (scoped to need):
tools:
  github:
    toolsets: [repos, issues]
    min-integrity: none

Opportunity 6: Pin Copilot CLI Version for Critical Workflows

What: All workflows use engine: copilot without a version pin, meaning they always install latest
Why It Matters: A breaking CLI update can fail many workflows simultaneously. Critical production workflows should be pinned for reproducibility.
Where: High-frequency scheduled workflows: hourly-ci-cleaner, daily-news, daily-copilot-token-report, weekly-blog-post-writer

engine:
  id: copilot
  version: "0.0.422"  # Pin a specific version instead of the default "latest"

Opportunity 7: Use `engine.env` for Workflow-Specific Configuration

What: No workflows use engine.env despite it being available for all engines
Why It Matters: This allows passing workflow-specific configuration to the agent without modifying the prompt, enabling cleaner separation of concerns
Where: Workflows that currently hardcode values in the prompt

engine:
  id: copilot
  env:
    TARGET_REPO: ${{ vars.TARGET_REPOSITORY }}
    ANALYSIS_WINDOW: "90d"
    DEBUG_MODE: "false"

🟢 Low Priority Opportunities

Opportunity 8: Set `sandbox.agent.memory` for Resource-Intensive Workflows

What: No workflow sets a memory limit for AWF containers
Why It Matters: Long-running agents (playwright, large file analysis) can consume excessive memory and impact runner stability
Where: Workflows with AWF sandbox that do heavy work

sandbox:
  agent:
    id: awf
    memory: "4g"  # Limit container to 4GB RAM

Opportunity 9: Use `block-domains` for Defense in Depth

What: No workflow uses domain blocking despite it being supported
Why It Matters: In addition to allowlisting, blocking known C2 and data exfiltration domains adds another layer of security
Where: Any workflow with AWF sandbox

network:
  allowed:
    - defaults
    - github.com
  blocked:
    - pastebin.com
    - ngrok.io
    - webhook.site

Opportunity 10: Enable `features.mcp-gateway` for Better MCP Routing

What: No production workflow uses the mcp-gateway feature (was tested in smoke workflows previously but removed)
Why It Matters: MCP gateway provides improved routing, observability, and error handling for MCP server calls
Where: Workflows with multiple MCP servers configured
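
A minimal sketch of opting a workflow into the gateway, reusing the features.mcp-gateway flag listed under Available Feature Flags. The MCP servers a given workflow configures are not shown in this report, so the fragment covers only the flag itself.

```yaml
# Frontmatter fragment: enable the MCP gateway for one workflow.
# The flag name is taken from the feature-flag inventory in this report;
# the rest of the workflow's frontmatter is omitted.
features:
  mcp-gateway: true
```

Because the flag was previously removed from the smoke workflows, it may be worth trialing on a single multi-server workflow before rolling it out more widely.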

Opportunity 11: Fix `ci-coach.md` Cache Path Bug

What: ci-coach.md writes to /tmp/cache-memory/ci-coach/last-analysis.json but the correct path is /tmp/gh-aw/cache-memory/
Why It Matters: Cache data is never persisted between runs, defeating the purpose of cache-memory
Fix: Update the path and add cache-memory: true to the tools section

tools:
  cache-memory: true  # Enable cache-memory tool

Then change the agent prompt to use /tmp/gh-aw/cache-memory/ci-coach/ instead of /tmp/cache-memory/ci-coach/.

Opportunity 12: Expand `max-continuations` Awareness in Comments

What: Several workflows include comments like "Note: max-turns not available for Copilot engine (Claude only)" — this is outdated; Copilot supports max-continuations (maps to --max-autopilot-continues)
Where: hourly-ci-cleaner.md comment block
Fix: Update comments to reference max-continuations as the Copilot equivalent

4️⃣ Specific Workflow Recommendations

View High-Impact Workflow Changes

`ci-doctor.md`

Current: Uses gpt-5.1-codex-mini model, no sandbox, no max-continuations
Recommend: Add max-continuations: 3 to allow iterative diagnosis, and sandbox: agent: awf for safety
Benefit: Agent can diagnose → attempt fix → verify → retry autonomously
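
Put together, the recommendation might look like the fragment below. It is a sketch assembled from snippets shown elsewhere in this report, not the actual file: the network allowlist and the top-level placement of max-continuations are assumptions, and the workflow's remaining frontmatter is omitted.

```yaml
# ci-doctor.md frontmatter (sketch, not the actual file)
engine:
  id: copilot
  model: gpt-5.1-codex-mini   # current model, per this report
sandbox:
  agent: awf                  # recommended: enable the AWF network firewall
network:
  allowed:                    # assumption: same allowlist as the Opportunity 1 example
    - defaults
    - github.com
max-continuations: 3          # recommended; placement mirrors the Opportunity 3 snippet
```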

`hourly-ci-cleaner.md`

Current: Uses AWF sandbox + mounts + agent: ci-cleaner ✅
Issue: Comment says "max-turns not available for Copilot" — outdated! Copilot supports max-continuations
Recommend: Add max-continuations: 3 and remove the incorrect comment
Benefit: CI cleaner can iterate on fixes rather than stopping after first attempt
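
One way the comment update could look is sketched below. The "before" text is the comment quoted in Opportunity 12; the "after" wording is only a suggestion, and the placement of max-continuations follows the Opportunity 3 snippet rather than the actual file.

```yaml
# Before (outdated comment, as quoted in this report):
# Note: max-turns not available for Copilot engine (Claude only)

# After (suggested wording, hypothetical):
# Note: max-turns is Claude-only. For Copilot, use max-continuations,
# which gh-aw translates to --autopilot --max-autopilot-continues N.
max-continuations: 3
```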

`contribution-check.md`

Current: No engine.agent configured
Recommend: Add engine: {id: copilot, agent: contribution-checker} to use the existing contribution-checker.agent.md
Benefit: Specialized contribution-checking persona with better accuracy

`code-scanning-fixer.md`

Current: No sandbox, no max-continuations
Recommend: sandbox: agent: awf, max-continuations: 2
Benefit: Security + iterative fixing capability

`ci-coach.md`

Current: Uses wrong cache path /tmp/cache-memory/
Recommend: Fix to /tmp/gh-aw/cache-memory/ and add cache-memory: true to tools
Benefit: Analysis history actually persists across runs

`workflow-generator.md`

Current: Has assign-to-agent but no specialized agent file
Recommend: Wire to interactive-agent-designer.agent.md
Benefit: Better workflow generation with specialized agent persona

5️⃣ Trends & Insights

View Historical Context (First Analysis)

This is the first comprehensive analysis of Copilot CLI usage in this repository. Future analyses will track:

Whether sandbox adoption increases after security recommendations
Which custom agents get wired to workflows
Whether max-continuations gets adopted for complex workflows
Whether startup-timeout gets standardized
Progression of GitHub tool permission scoping

Baseline established: 2026-03-19, Run §23317404420

6️⃣ Best Practice Guidelines

Based on this research, here are recommended best practices for Copilot engine workflows:

Always set startup-timeout: Add startup-timeout: 3 to all Copilot workflows to catch hung initialization early
Use AWF for untrusted input: Any workflow triggered by user-submitted content (issues, PRs, comments) should have sandbox: agent: awf
Leverage custom agents: Before writing long behavioral prompts, check if an existing .github/agents/*.agent.md matches your use case — or create one
Scope GitHub tool permissions: Use toolsets: [repos, issues] instead of toolsets: [default] when your workflow only needs repository and issue access
Use max-continuations for iterative tasks: CI fixers, code reviewers, and content generators benefit from multiple autonomous continuation loops
Pin versions for critical scheduled workflows: Use engine: {id: copilot, version: "X.Y.Z"} for high-frequency scheduled workflows to prevent surprise breakage
Use correct cache-memory path: Always use /tmp/gh-aw/cache-memory/ (not /tmp/cache-memory/)
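
The guidelines above can be combined into a single starting template. The fragment below is a sketch for a Copilot workflow that handles user-submitted content: key names and placement follow the snippets in this report, while the version number, timeouts, and allowlist entries are illustrative values to adjust per workflow.

```yaml
# Baseline frontmatter for a Copilot workflow (sketch; values are illustrative)
engine:
  id: copilot
  version: "0.0.422"        # example pin from the inventory; pick the version you have validated
startup-timeout: 3          # fail fast if the agent does not start within 3 minutes
timeout-minutes: 30
sandbox:
  agent: awf                # network firewall for untrusted input
network:
  allowed:
    - defaults
    - github.com
tools:
  github:
    toolsets: [repos, issues]   # scoped instead of [default]
max-continuations: 3        # include only for iterative tasks
```

Settings such as engine.agent and cache-memory are left out because they depend on the individual workflow.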

7️⃣ Action Items

Immediate Actions (this week):

Fix ci-coach.md cache path bug (/tmp/cache-memory/ → /tmp/gh-aw/cache-memory/)
Update hourly-ci-cleaner.md comment about max-turns; add max-continuations: 3
Wire contribution-checker.agent.md to contribution-check.md

Short-term (this month):

Audit 71 sandbox-less Copilot workflows; enable AWF for those processing user input
Wire remaining unused custom agents to appropriate workflows
Add startup-timeout: 3 as standard to all Copilot workflows
Add max-continuations: 2-3 to ci-doctor, code-scanning-fixer

Long-term (this quarter):

Scope GitHub tool permissions from toolsets: [default] to specific toolsets where possible
Evaluate features.mcp-gateway for workflows with multiple MCP servers
Consider engine.env pattern for workflows with hardcoded configuration values
Establish version pinning policy for scheduled production workflows

View Supporting Evidence & Methodology

Research Methodology

Codebase scan: Reviewed all Copilot-related Go files (pkg/workflow/copilot_*.go) to inventory available features and CLI flags
Documentation review: Analyzed docs/src/content/docs/reference/engines.md for documented capabilities
Workflow survey: Grep analysis across all 175 .github/workflows/*.md files for feature usage patterns
Constants analysis: Reviewed pkg/constants/constants.go for feature flags and engine configuration
Execution analysis: Reviewed copilot_engine_execution.go to understand how features map to CLI flags

Key Files Analyzed

pkg/workflow/copilot_engine.go — Engine interface and capabilities
pkg/workflow/copilot_engine_execution.go — CLI argument construction
pkg/workflow/copilot_engine_tools.go — Tool permission logic
pkg/workflow/copilot_mcp.go — MCP server configuration
pkg/workflow/sandbox.go — Sandbox configuration structures
docs/src/content/docs/reference/engines.md — Engine documentation

Data Sources

175 workflow markdown files in .github/workflows/*.md
9 agent files in .github/agents/*.agent.md


AI generated by Copilot CLI Deep Research Agent · history

expires on Mar 20, 2026, 9:29 PM UTC

2026-03-19T21:55:58Z

github-actions[bot]
bot Mar 19, 2026
Author

🤖 Beep boop! The smoke test agent was here! Testing all systems... ✅ Running checks on PR #21752. Stay tuned for the full report!

Note

🔒 Integrity filtering filtered 2 items

Integrity filtering activated and filtered the following items during workflow execution.
This happens when a tool call accesses a resource that does not meet the required integrity or secrecy level of the workflow.

pr:feat: mount custom GitHub Actions as safe output tools via safe-outputs.actions #21752 (pull_request_read: Resource 'pr:feat: mount custom GitHub Actions as safe output tools via safe-outputs.actions #21752' has lower integrity than agent requires. Agent would need to drop integrity tags [approved:all unapproved:all] to trust this resource.)
pr:feat: mount custom GitHub Actions as safe output tools via safe-outputs.actions #21752 (pull_request_read: Resource 'pr:feat: mount custom GitHub Actions as safe output tools via safe-outputs.actions #21752' has lower integrity than agent requires. Agent would need to drop integrity tags [unapproved:all approved:all] to trust this resource.)

📰 BREAKING: Report filed by Smoke Copilot · ◷


2026-03-19T21:56:05Z

github-actions[bot]
bot Mar 19, 2026
Author

🎉 The smoke test robot stopped by for a visit!

tests run in the dark
each assertion a small light
green means we can sleep

11/12 tests passed in run §23318554061. The only hiccup? Serena MCP isn't around at the party 🎈. Everything else is rocking! 🤖💚

Note

🔒 Integrity filtering filtered 2 items

Integrity filtering activated and filtered the following items during workflow execution.
This happens when a tool call accesses a resource that does not meet the required integrity or secrecy level of the workflow.

pr:feat: mount custom GitHub Actions as safe output tools via safe-outputs.actions #21752 (pull_request_read: Resource 'pr:feat: mount custom GitHub Actions as safe output tools via safe-outputs.actions #21752' has lower integrity than agent requires. Agent would need to drop integrity tags [approved:all unapproved:all] to trust this resource.)
pr:feat: mount custom GitHub Actions as safe output tools via safe-outputs.actions #21752 (pull_request_read: Resource 'pr:feat: mount custom GitHub Actions as safe output tools via safe-outputs.actions #21752' has lower integrity than agent requires. Agent would need to drop integrity tags [unapproved:all approved:all] to trust this resource.)

📰 BREAKING: Report filed by Smoke Copilot · ◷


2026-03-19T21:59:24Z

github-actions[bot]
bot Mar 19, 2026
Author

💥 WHOOSH! 🦸‍♂️ KAPOW! The smoke test agent bursts onto the scene!

⚡ ZAP! Claude Engine ACTIVATED — Run 23318554140 is NOMINAL!

"With great automation comes great responsibility!" — The Smoke Test Agent

🌟 BOOM! All systems GO! The agentic workflows are ALIVE and KICKING! 🚀

Note

🔒 Integrity filtering filtered 1 item

Integrity filtering activated and filtered the following item during workflow execution.
This happens when a tool call accesses a resource that does not meet the required integrity or secrecy level of the workflow.

pr:feat: mount custom GitHub Actions as safe output tools via safe-outputs.actions #21752 (pull_request_read: Resource 'pr:feat: mount custom GitHub Actions as safe output tools via safe-outputs.actions #21752' has lower integrity than agent requires. Agent would need to drop integrity tags [unapproved:all approved:all] to trust this resource.)

💥 [THE END] — Illustrated by Smoke Claude · ◷


2026-03-20T21:26:47Z

github-actions[bot]
bot Mar 20, 2026
Author

This discussion has been marked as outdated by Copilot CLI Deep Research Agent.

A newer discussion is available at Discussion #22031.

