Rate-limit circuit breaker for GitHub MCP backend tool calls #3787

@lpcox

Problem

The MCP Structural Analysis report (Apr 14, 2026) detected the first GitHub MCP installation rate-limit event: four tools hit the limit of 15,000 requests per reset window simultaneously. Without a circuit breaker or throttling strategy, high-frequency workflows trigger cascading rate limits across multiple agents that share the same GitHub App installation token.

Source: github/gh-aw#26239

Current behavior

The gateway has no rate-limit awareness for tool calls:

| Component | Status |
| --- | --- |
| isTransientHTTPError() detects 429 | ✅ But only used for config schema fetch, not tool calls |
| X-RateLimit-* headers propagated in proxy mode | ✅ But not inspected or acted upon |
| Circuit breaker | ❌ Not implemented |
| Retry with backoff for tool calls | ❌ Not implemented |
| Throttling / request budget | ❌ Not implemented |

When the GitHub MCP server returns a rate-limited response, callBackendTool (unified.go) and the proxy handler propagate the error directly to the agent. The agent retries immediately, worsening the rate-limit storm.

Affected code paths

  • Gateway mode: internal/server/unified.go → callBackendTool() → executeBackendToolCall(): no 429 handling
  • Proxy mode: internal/proxy/handler.go → copyResponseHeaders() propagates X-RateLimit-* headers but does not inspect them for backoff decisions

Proposed solution

Phase 1: Rate-limit aware backoff (both modes)

Gateway mode (internal/server/unified.go):

  • In executeBackendToolCall, inspect the backend MCP response for rate-limit indicators
  • When the GitHub MCP server returns a tool result indicating rate limiting (error code or X-RateLimit-Remaining: 0), apply exponential backoff before retrying (up to 3 attempts)
  • Log rate-limit events at ERROR level with the X-RateLimit-Reset timestamp so operators can see when the limit resets
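
The gateway-mode retry described above could look roughly like the following. This is a minimal sketch, not the gateway's actual code: `errRateLimited`, `backoffDelay`, and `callWithBackoff` are hypothetical names, and how a rate-limited tool result is detected is left as an assumption behind the sentinel error.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errRateLimited is a hypothetical sentinel. The real gateway would derive it
// from the backend tool result or from X-RateLimit-Remaining: 0.
var errRateLimited = errors.New("backend rate limited")

// backoffDelay doubles base once per prior attempt and caps the result at ceiling.
func backoffDelay(attempt int, base, ceiling time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt && d < ceiling; i++ {
		d *= 2
	}
	if d > ceiling {
		d = ceiling
	}
	return d
}

// callWithBackoff runs call up to maxAttempts times. It sleeps between
// attempts only when the failure is a rate-limit error; any other result
// (success or a different error) is returned immediately.
// A nil sleep falls back to time.Sleep; tests can pass a recorder instead.
func callWithBackoff(call func() error, maxAttempts int, base, ceiling time.Duration, sleep func(time.Duration)) error {
	if sleep == nil {
		sleep = time.Sleep
	}
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var err error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		err = call()
		if !errors.Is(err, errRateLimited) {
			return err
		}
		if attempt < maxAttempts-1 {
			sleep(backoffDelay(attempt, base, ceiling))
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", maxAttempts, err)
}

func main() {
	calls := 0
	err := callWithBackoff(func() error {
		calls++
		if calls < 3 {
			return errRateLimited // first two calls are rate limited
		}
		return nil
	}, 3, 10*time.Millisecond, time.Second, nil)
	fmt.Println("calls:", calls, "err:", err)
}
```

A real implementation would probably prefer the X-RateLimit-Reset time over a fixed schedule when it is available, since retrying before the window resets cannot succeed.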

Proxy mode (internal/proxy/handler.go):

  • After copyResponseHeaders, inspect X-RateLimit-Remaining from the upstream response
  • When remaining is 0 (or the response is HTTP 429), inject a Retry-After header into the response to the agent
  • Log rate-limit events at ERROR level with reset time and the tool that triggered it
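
As a rough illustration of the proxy-mode steps above, the sketch below inspects the upstream headers and sets Retry-After on the outgoing response. `annotateRateLimit` and `fallbackRetryAfter` are hypothetical names, the 60-second fallback is an assumption, and the code assumes X-RateLimit-Reset carries a Unix timestamp in seconds, as GitHub's REST API does.

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
	"time"
)

// fallbackRetryAfter is an assumed default (seconds) for when the upstream
// response carries no usable X-RateLimit-Reset value.
const fallbackRetryAfter = 60

// annotateRateLimit is meant to run after the upstream headers have been
// copied into dst. It reports whether the upstream response was rate limited
// and, if so, makes sure dst carries a Retry-After header in seconds.
func annotateRateLimit(dst, upstream http.Header, status int, nowUnix int64) bool {
	limited := status == http.StatusTooManyRequests ||
		upstream.Get("X-RateLimit-Remaining") == "0"
	if !limited {
		return false
	}
	if dst.Get("Retry-After") != "" {
		return true // upstream already supplied a value; keep it
	}
	wait := int64(fallbackRetryAfter)
	if reset, err := strconv.ParseInt(upstream.Get("X-RateLimit-Reset"), 10, 64); err == nil {
		wait = reset - nowUnix
		if wait < 1 {
			wait = 1 // reset time already passed; ask for a minimal pause
		}
	}
	dst.Set("Retry-After", strconv.FormatInt(wait, 10))
	return true
}

func main() {
	now := time.Now().Unix()
	upstream := http.Header{}
	upstream.Set("X-RateLimit-Remaining", "0")
	upstream.Set("X-RateLimit-Reset", strconv.FormatInt(now+30, 10))

	dst := upstream.Clone() // stands in for the headers copied to the agent
	limited := annotateRateLimit(dst, upstream, http.StatusOK, now)
	fmt.Println("limited:", limited, "Retry-After:", dst.Get("Retry-After"))
}
```

Passing the current time as a parameter keeps the helper deterministic, which makes the reset arithmetic easy to unit test.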

Phase 2: Per-backend circuit breaker

Add a circuit breaker per backend server ID in the gateway:

States: CLOSED → OPEN → HALF-OPEN → CLOSED
  • CLOSED (normal): requests pass through
  • OPEN (tripped): after N consecutive rate-limit errors, reject requests immediately with a descriptive error and the X-RateLimit-Reset time — no upstream call made
  • HALF-OPEN (probe): after the reset time elapses, allow one probe request. If it succeeds, transition to CLOSED; if it is rate-limited again, return to OPEN with the new reset time
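
The three states above can be sketched as a small mutex-guarded type. This is one possible shape under stated assumptions, not the proposed implementation: the `breaker` type and its `allow` and `record` methods are hypothetical, and times are plain Unix seconds so the logic is easy to test.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type state int

const (
	closed state = iota
	open
	halfOpen
)

// breaker tracks consecutive rate-limit errors for one backend server.
type breaker struct {
	mu        sync.Mutex
	st        state
	failures  int   // consecutive rate-limit errors seen
	threshold int   // failures needed to trip from closed to open
	resetAt   int64 // Unix seconds taken from X-RateLimit-Reset
}

func newBreaker(threshold int) *breaker { return &breaker{threshold: threshold} }

// allow reports whether a request may be sent upstream at time now.
func (b *breaker) allow(now int64) bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	switch b.st {
	case closed:
		return true
	case open:
		if now < b.resetAt {
			return false // still inside the rate-limit window
		}
		b.st = halfOpen // window elapsed: let exactly one probe through
		return true
	default: // halfOpen: a probe is already in flight
		return false
	}
}

// record feeds the outcome of an upstream call back into the breaker.
func (b *breaker) record(rateLimited bool, resetAt int64) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if !rateLimited {
		b.st, b.failures = closed, 0
		return
	}
	b.failures++
	if b.st == halfOpen || b.failures >= b.threshold {
		b.st, b.resetAt = open, resetAt
	}
}

func main() {
	now := time.Now().Unix()
	b := newBreaker(3)
	for i := 0; i < 3; i++ {
		b.record(true, now+60) // three consecutive rate-limit errors
	}
	fmt.Println(b.allow(now), b.allow(now+60), b.allow(now+60))
}
```

One simplification to note: a success reported by a request that was already in flight when the breaker tripped would close it early. A fuller version might ignore results that predate the transition to OPEN.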

Configuration (per-server in TOML/JSON):

[servers.github]
type = "http"
url = "..."
# Circuit breaker settings
rate_limit_threshold = 3      # consecutive 429s before opening circuit
rate_limit_cooldown = 60      # seconds to stay OPEN before probing
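
Since the configuration is also accepted as JSON, the two settings could be decoded along these lines. The `serverConfig` struct and `parseServerConfig` function are hypothetical, and treating 3 and 60 as defaults when the keys are absent is an assumption; the issue only shows them as example values.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// serverConfig mirrors the per-server keys shown above (hypothetical struct).
type serverConfig struct {
	Type               string `json:"type"`
	URL                string `json:"url"`
	RateLimitThreshold int    `json:"rate_limit_threshold"` // consecutive 429s before opening
	RateLimitCooldown  int    `json:"rate_limit_cooldown"`  // seconds to stay OPEN
}

// parseServerConfig decodes raw JSON over a struct pre-filled with assumed
// defaults, so keys missing from the input keep those defaults.
func parseServerConfig(raw []byte) (serverConfig, error) {
	cfg := serverConfig{RateLimitThreshold: 3, RateLimitCooldown: 60}
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return cfg, fmt.Errorf("decode server config: %w", err)
	}
	if cfg.RateLimitThreshold < 1 {
		return cfg, fmt.Errorf("rate_limit_threshold must be >= 1, got %d", cfg.RateLimitThreshold)
	}
	if cfg.RateLimitCooldown < 0 {
		return cfg, fmt.Errorf("rate_limit_cooldown must be >= 0, got %d", cfg.RateLimitCooldown)
	}
	return cfg, nil
}

func main() {
	raw := []byte(`{"type": "http", "rate_limit_threshold": 5}`)
	cfg, err := parseServerConfig(raw)
	fmt.Println(cfg.RateLimitThreshold, cfg.RateLimitCooldown, err)
}
```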

Phase 3: Request budget / throttling (optional)

Per-session or per-workflow request budget:

  • Track request count per (sessionID, serverID) pair
  • When approaching the rate limit (e.g., X-RateLimit-Remaining < 100), throttle by adding artificial delay between requests
  • Surface budget usage in the gateway health endpoint
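
A possible shape for the budget tracking and throttling above is sketched below. The `budget` type, `throttleDelay`, and the "spread the remaining requests evenly over the time left" policy are all assumptions chosen for illustration; the issue does not prescribe a particular delay formula.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// budgetKey identifies one (sessionID, serverID) pair.
type budgetKey struct{ sessionID, serverID string }

// budget counts requests per (sessionID, serverID) pair.
type budget struct {
	mu     sync.Mutex
	counts map[budgetKey]int
}

func newBudget() *budget { return &budget{counts: make(map[budgetKey]int)} }

// record increments the counter for the pair and returns the new total.
func (b *budget) record(sessionID, serverID string) int {
	b.mu.Lock()
	defer b.mu.Unlock()
	k := budgetKey{sessionID, serverID}
	b.counts[k]++
	return b.counts[k]
}

// throttleDelay returns how long to wait before the next request.
// At or above lowWater there is no delay. Below it, the remaining requests
// are spread evenly over the time left until the limit resets.
func throttleDelay(remaining, lowWater int, untilReset time.Duration) time.Duration {
	if remaining >= lowWater || untilReset <= 0 {
		return 0
	}
	if remaining <= 0 {
		return untilReset // nothing left: wait for the reset
	}
	return untilReset / time.Duration(remaining)
}

func main() {
	b := newBudget()
	b.record("session-1", "github")
	fmt.Println("requests so far:", b.record("session-1", "github"))
	fmt.Println("delay:", throttleDelay(50, 100, 10*time.Minute))
}
```

The per-pair counts are what a health endpoint could report; the delay function depends only on the values from the most recent X-RateLimit-* headers.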

Implementation notes

  • The isTransientHTTPError() function in internal/config/validation_schema.go already correctly classifies 429 as transient — this logic should be reused
  • The X-RateLimit-* headers are already captured in httpRequestResult.Header (http_transport.go:391) — they just need to be inspected
  • The copyResponseHeaders() in proxy mode already forwards rate-limit headers — adding inspection is a small change
  • Circuit breaker state should live on the UnifiedServer struct, keyed by server ID
  • The existing lockable pattern from the logger package could be used for the circuit breaker mutex
