[aw-failures] Workflow timing out at 40min — MCP get_file_contents 37–71s per call, LLM turns 4–10min

Hi,
We are seeing consistent 40-minute timeouts on our scaffold workflow starting today (Apr 21). Two of our runs hit the hard timeout before the agent could commit/push the scaffold branch. This was working reliably in ~15 minutes last week.

Timeline of one affected run (today):

```
09:26:47  ● Read module standard
09:29:11  ● Create feature branch              (2m 24s gap)
09:31:50  ● List AVM module files              (2m 39s gap)
09:34:00  ● Check existing renovate            (2m 10s gap)
09:35:36  ● Create root outputs.tf
09:37:51  ● Create examples/data.tf            (2m 15s gap)
09:45:21  ● Create storage_account.tftest.hcl  (4m 36s gap)
09:47:42  ● Create README.md                   (2m 21s gap)
09:48:46  ● List all terraform files
09:59:14  ● Response was interrupted due to a server error. Retrying... (10m 28s gap)
10:06:01  ##[error] The action 'Execute GitHub Copilot CLI' has timed out after 40 minutes.
```
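The gap annotations in the timeline above can be recomputed from the timestamps alone. The following is a minimal sketch, not part of the workflow: `step_gaps` is a hypothetical helper, and it assumes every line begins with an `HH:MM:SS` timestamp and that the run does not cross midnight. The sample uses the first three lines of the failing run.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def step_gaps(lines: List[str]) -> List[Tuple[str, timedelta]]:
    """Pair each step with the time elapsed since the previous step.

    Assumes each line starts with an HH:MM:SS timestamp followed by
    whitespace and a step label, all within the same day.
    """
    gaps = []
    prev = None
    for line in lines:
        stamp, _, label = line.strip().partition(" ")
        now = datetime.strptime(stamp, "%H:%M:%S")
        if prev is not None:
            gaps.append((label.strip(), now - prev))
        prev = now
    return gaps


# First three steps of the Apr 21 run, copied from the timeline.
sample = [
    "09:26:47  ● Read module standard",
    "09:29:11  ● Create feature branch",
    "09:31:50  ● List AVM module files",
]

for label, gap in step_gaps(sample):
    print(f"{gap}  {label}")
```

Running the same helper over a full log export would make it easy to spot steps whose gap is far above the rest.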

Working run for comparison (Apr 16, different module):

```
13:33:42  ● Read module standard
13:34:30  ● Check current git state            (48s gap — longest)
13:34:36  ● Check AVM module variables         (6s gap)
13:38:29  ● Create feature branch              (LLM planning)
13:38:35  ● Create required directories        (6s)
13:39:20  ● Create examples/provider.tf        (22s)
... all gaps under 48s, completed in 11m 38s total
```

gh-aw version: v0.68.3

Direct timing comparison:

Metric | Apr 16 (working, 11m38s) | Apr 21 (failing, 40m timeout)
-- | -- | --
LLM turn time (between tool calls) | 0–48 seconds max | 4–10 minutes
AVM module file reads | 6–19s (shell curl/gh api) | 37–71s each (get_file_contents MCP)
File creation tool calls | 0–22s | 1–4 minutes between each
Server errors | None | Response was interrupted due to a server error. Retrying... at 09:59
Completed? | ✅ Yes | ❌ Both timed out

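Notable: last week the agent read AVM module files via shell (curl/gh api), taking 6–19s each. Today it routes through get_file_contents (MCP: github), taking 37–71s each, which is roughly 2–12x slower depending on which ends of the two ranges are compared. LLM turns are at least 5x slower than last week's slowest turn (4–10 minutes vs. 48 seconds).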
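The slowdown factors implied by the ranges in the table can be checked with a few lines of arithmetic. This is only a sketch: `slowdown_range` is a hypothetical helper, and the inputs are the ranges quoted above converted to seconds.

```python
from typing import Tuple


def slowdown_range(before: Tuple[float, float],
                   after: Tuple[float, float]) -> Tuple[float, float]:
    """Return (smallest, largest) slowdown factor between two (low, high) ranges."""
    b_lo, b_hi = before
    a_lo, a_hi = after
    # Smallest factor: fastest new call vs. slowest old call.
    # Largest factor: slowest new call vs. fastest old call.
    return a_lo / b_hi, a_hi / b_lo


# AVM module file reads: 6-19 s (Apr 16) vs. 37-71 s (Apr 21).
file_reads = slowdown_range((6, 19), (37, 71))
print(f"file reads: {file_reads[0]:.1f}x to {file_reads[1]:.1f}x slower")

# LLM turns ranged from 0 to 48 s on Apr 16, so only the comparison
# against the old maximum (48 s) is well defined.
llm_min = (4 * 60) / 48
llm_max = (10 * 60) / 48
print(f"LLM turns: {llm_min:.1f}x to {llm_max:.1f}x the old maximum")
```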

Observed symptoms:

- get_file_contents (MCP: github) calls taking 37–71 seconds each, reading individual files from Azure/terraform-azurerm-avm-res-storage-storageaccount via the GitHub MCP server. Example from run 1:
  - variables.tf → 71s
  - variables.storage.tf → 51s (then failed with path-not-found)
  - variables.storageaccount.tf → 62s
- LLM turns taking 4–10 minutes, e.g. the gap from 09:48:46 to 09:59:14 (10m28s), ending with "Response was interrupted due to a server error. Retrying...".

Neither run completed — both exhausted the full 40-minute budget without reaching the git commit/push step.
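To help separate GitHub MCP server latency from latency in the underlying GitHub API, a direct read of the same file could be timed outside the workflow. The sketch below is an assumption-laden illustration, not something taken from the failing runs: `fetch_via_rest`, `timed`, and `report` are hypothetical helpers, and it assumes unauthenticated access to the public repository through the REST contents endpoint with the raw media type.

```python
import time
import urllib.request
from typing import Callable, Iterable, Tuple

# Repository name taken from this report; everything else is illustrative.
REPO = "Azure/terraform-azurerm-avm-res-storage-storageaccount"


def fetch_via_rest(path: str) -> bytes:
    """Read one file from the default branch via the REST contents endpoint."""
    req = urllib.request.Request(
        f"https://api.github.com/repos/{REPO}/contents/{path}",
        headers={"Accept": "application/vnd.github.raw+json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return resp.read()


def timed(fn: Callable[[str], bytes], path: str) -> Tuple[float, int]:
    """Return (elapsed seconds, bytes read) for a single call to fn(path)."""
    start = time.perf_counter()
    body = fn(path)
    return time.perf_counter() - start, len(body)


def report(paths: Iterable[str]) -> None:
    """Print one timing line per path; this performs real network requests."""
    for p in paths:
        elapsed, size = timed(fetch_via_rest, p)
        print(f"{p}: {elapsed:.1f}s, {size} bytes")
```

Calling `report(["variables.tf"])` from a runner in the same environment would give a baseline to compare against the 37–71s observed through get_file_contents.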

Environment:

- Lock file pins: ghcr.io/github/gh-aw-mcpg:v0.2.19, github-mcp-server:v0.32.0, gh-aw-firewall:0.25.20
- gh-aw-actions@ba90f2186d7ad780ec640f364005fa24e797b360
- No changes on our side between Apr 16 (working) and Apr 21 (failing)

Suspected cause: Copilot API rate limiting / degradation noted in #27339 (P1: rate limits) combined with elevated GitHub MCP server response times. The combination means a workflow that normally completes in 15 minutes cannot finish within the 40-minute cap.


