BerriAI · michelligabriele · May 18, 2026
diff --git a/docs/a2a.md b/docs/a2a.md
@@ -211,12 +211,46 @@ POST /a2a/{agent_name}/message/send
 
 ### Authentication
 
-Include your LiteLLM Virtual Key in the `Authorization` header:
+Include your LiteLLM Virtual Key in either of two headers — `x-litellm-api-key` is preferred when the inbound `Authorization` header may carry a token destined for the backend agent (e.g. when using the [convention-based passthrough](./a2a_agent_headers#method-3--convention-based-forwarding) to forward the caller's identity).
 
 ```
 Authorization: Bearer sk-your-litellm-key
+# or
+x-litellm-api-key: Bearer sk-your-litellm-key
 ```
 
+#### Per-agent permission check
+
+After the virtual key is authenticated, LiteLLM checks whether the calling key (and its team) is allowed to invoke the requested agent. If not, the response is HTTP 403. See [Agent Permission Management](./a2a_agent_permissions) for the full intersection model and access groups.
+
+#### Trace ID enforcement (optional, per-agent)
+
+An agent can require every inbound request to carry a trace ID for cross-system audit threading. Set `require_trace_id_on_calls_to_agent: true` in the agent's `litellm_params`. When set, requests that carry neither `x-litellm-trace-id` nor `x-litellm-session-id` are rejected with HTTP 400.
+
+```bash title="Register an agent that requires inbound trace IDs" showLineNumbers
+curl -X POST http://localhost:4000/v1/agents \
+  -H "Authorization: Bearer sk-master-key" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "agent_name": "audit-critical-agent",
+    "agent_card_params": { ... },
+    "litellm_params": {
+      "require_trace_id_on_calls_to_agent": true
+    }
+  }'
+```
+
+The reverse direction, enforcing a trace ID on **outbound** calls made by a key owned by an agent, is controlled by `require_trace_id_on_calls_by_agent` in the same `litellm_params` block.
+
+#### Sub-agent identity propagation
+
+When the backend agent itself calls LiteLLM (for chat completions or to invoke a sub-agent), LiteLLM forwards two headers to maintain trace continuity and spend attribution:
+
+- `X-LiteLLM-Trace-Id` — links all calls in the chain to a single trace
+- `X-LiteLLM-Agent-Id` — attributes spend to the originating agent
+
+The caller's **virtual key** and **end-user ID** are not automatically forwarded. If the downstream agent needs the user's identity, propagate it explicitly via [`extra_headers` or the `x-a2a-{agent_name_or_id}-{header}` convention](./a2a_agent_headers).
+
 ### Request Format
 
 LiteLLM follows the [A2A JSON-RPC 2.0 specification](https://github.com/google/A2A):
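Review note on the `docs/a2a.md` hunk above: the trace ID enforcement rule can be sketched in Python as a reading aid. This is an illustrative model under stated assumptions, not LiteLLM's implementation. The function name `check_inbound_trace_id` and the dict-based inputs are hypothetical; only the header names, the `require_trace_id_on_calls_to_agent` flag, and the HTTP 400 status come from the docs.

```python
from typing import Mapping, Optional

# Header names taken from the docs; either one satisfies the requirement.
TRACE_HEADERS = ("x-litellm-trace-id", "x-litellm-session-id")


def check_inbound_trace_id(
    headers: Mapping[str, str], litellm_params: Mapping[str, object]
) -> Optional[int]:
    """Return 400 if the agent requires a trace ID and none is present, else None.

    Hypothetical helper: models the documented rule only.
    """
    if not litellm_params.get("require_trace_id_on_calls_to_agent"):
        return None  # enforcement is opt-in per agent
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    if any(lowered.get(name) for name in TRACE_HEADERS):
        return None
    return 400
```

For example, `check_inbound_trace_id({"X-LiteLLM-Trace-Id": "abc"}, {"require_trace_id_on_calls_to_agent": True})` passes, while the same call with empty headers yields the 400 status.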

diff --git a/docs/a2a_agent_permissions.md b/docs/a2a_agent_permissions.md
@@ -208,35 +208,72 @@ curl -X POST "http://localhost:4000/a2a/agent-456" \
   -d '{"message": {"role": "user", "parts": [{"type": "text", "text": "Hello"}]}}'
 ```
 
+## Agent Access Groups
+
+Granting individual agents to every key or team gets unwieldy as the agent catalog grows. **Agent access groups** let you tag agents with logical labels in the dashboard, then grant the **group** to a key or team. Adding a new agent to the group automatically makes it available to every key or team that holds the group.
+
+### 1. Tag the agent with one or more groups
+
+In the LiteLLM dashboard:
+
+1. Go to **Agents**.
+2. Create or edit an agent.
+3. Under **Access Groups**, type a group name (e.g. `clinical-tools`) and press Enter.
+
+:::note
+Tagging an agent with access groups is currently a dashboard-only operation. The `POST /v1/agents` body schema does not expose `agent_access_groups` as a top-level field; the group tags persist via the underlying DB column and are consumed during permission resolution.
+:::
+
+### 2. Grant a key or team the group
+
+```bash title="Key with access to two agent groups" showLineNumbers
+curl -X POST "http://localhost:4000/key/generate" \
+  -H "Authorization: Bearer sk-master-key" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "object_permission": {
+      "agent_access_groups": ["clinical-tools", "research-tools"]
+    }
+  }'
+```
+
+The key now has access to every agent tagged with either group — no per-agent enumeration required. The same `agent_access_groups` field is also valid on a team's `object_permission`.
+
+When a key has **both** a direct `agents` list and `agent_access_groups`, the union is computed (any agent reached by either path is allowed), and then the team-level intersection is applied as described below.
+
 ## How It Works
 
 ```mermaid
 flowchart TD
     A[Request to invoke agent] --> B{LiteLLM Virtual Key has agent restrictions?}
     B -->|Yes| C{LiteLLM Team has agent restrictions?}
     B -->|No| D{LiteLLM Team has agent restrictions?}
-    
+
     C -->|Yes| E[Use intersection of key + team permissions]
     C -->|No| F[Use key permissions only]
-    
+
     D -->|Yes| G[Inherit team permissions]
     D -->|No| H[Allow ALL agents]
-    
+
     E --> I{Agent in allowed list?}
     F --> I
     G --> I
     H --> J[Allow request]
-    
+
     I -->|Yes| J
     I -->|No| K[Return 403 Forbidden]
 ```
 
+A2A permission resolution operates over two levels: Key and Team. MCP's [permission hierarchy](./mcp_control#permission-hierarchy) additionally resolves the End-user, Agent, and Org levels, so agent permissions are a narrower model today.
+
 | Key Permissions | Team Permissions | Result | Notes |
 |-----------------|------------------|--------|-------|
 | None | None | Key can access **all** agents | Open access by default when no restrictions are set |
 | `["agent-1", "agent-2"]` | None | Key can access `agent-1` and `agent-2` | Key uses its own permissions |
 | None | `["agent-1", "agent-3"]` | Key can access `agent-1` and `agent-3` | Key inherits team's permissions |
 | `["agent-1", "agent-2"]` | `["agent-1", "agent-3"]` | Key can access `agent-1` only | Intersection of both lists (most restrictive wins) |
+| `agent_access_groups: ["clinical"]` | None | Key can access every agent tagged `clinical` | Access groups resolved to concrete agent IDs |
+| `agent_access_groups: ["clinical"]` | `agents: ["agent-1"]` | Intersection of (every agent tagged `clinical`) and `["agent-1"]` | Mixing direct and group grants is supported |
 
 ## Viewing Permissions
 

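Review note on the `docs/a2a_agent_permissions.md` hunk above: the resolution table (union of direct and group grants, then key/team intersection) can be modelled in a few lines of Python. This is a sketch under assumptions, not LiteLLM's code. `GROUP_MEMBERS`, `expand_grants`, `resolve_allowed`, and `is_allowed` are hypothetical names, and treating an empty list the same as an unset one is an assumption the docs do not confirm.

```python
from typing import Dict, FrozenSet, Iterable, Optional, Set

# Hypothetical group -> agent mapping; in LiteLLM this is derived from agent tags.
GROUP_MEMBERS: Dict[str, Set[str]] = {"clinical": {"agent-1", "agent-7"}}


def expand_grants(
    agents: Optional[Iterable[str]] = None,
    groups: Optional[Iterable[str]] = None,
    group_members: Dict[str, Set[str]] = GROUP_MEMBERS,
) -> Optional[FrozenSet[str]]:
    """Union of direct and group grants; None means 'no restrictions set'."""
    if not agents and not groups:
        return None  # assumption: empty and unset are treated alike
    allowed = set(agents or ())
    for group in groups or ():
        allowed |= group_members.get(group, set())
    return frozenset(allowed)


def resolve_allowed(
    key: Optional[FrozenSet[str]], team: Optional[FrozenSet[str]]
) -> Optional[FrozenSet[str]]:
    """Apply the four rows of the table; None means all agents are allowed."""
    if key is None:
        return team  # inherit the team's list (or stay unrestricted)
    if team is None:
        return key  # key permissions only
    return key & team  # intersection: most restrictive wins


def is_allowed(
    agent_id: str, key: Optional[FrozenSet[str]], team: Optional[FrozenSet[str]]
) -> bool:
    """True if the request would pass; False corresponds to HTTP 403."""
    allowed = resolve_allowed(key, team)
    return allowed is None or agent_id in allowed
```

With the sample mapping, a key holding the `clinical` group on a team restricted to `agent-1` resolves to `agent-1` only, which matches the last table row.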
diff --git a/docs/auth_overview.md b/docs/auth_overview.md
@@ -0,0 +1,161 @@
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+# AuthN/AuthZ Reference — MCP and A2A Side-by-Side
+
+LiteLLM exposes two gateway surfaces that share most authentication and authorization primitives but diverge in a few important places. This page is the side-by-side reference: which header does what, where the two surfaces are symmetric, and where they're not. Each section links out to the dedicated page for the deep dive.
+
+| Surface | Endpoints | Dedicated docs |
+|---|---|---|
+| **MCP Gateway** | `/mcp`, `/{server}/mcp`, `/toolset/{name}/mcp`, `/sse`, `/v1/mcp/...`, `/mcp-rest/...` | [MCP Overview](./mcp) |
+| **A2A Agent Gateway** | `/a2a/{agent_id}`, `/a2a/{agent_id}/message/send`, `/v1/agents/...` | [A2A Overview](./a2a) |
+
+---
+
+## 1. Client → LiteLLM (authenticating the caller)
+
+Both surfaces accept the same LiteLLM Virtual Key headers and the same identification headers. The one place they diverge: the MCP **ASGI** routes (the streamable MCP endpoints at `/mcp`, `/{name}/mcp`, `/toolset/{name}/mcp`, `/sse`) bypass the standard FastAPI auth dependency and only check `x-litellm-api-key` and `Authorization`. The MCP **REST/management** routes (`/v1/mcp/...`, `/mcp-rest/...`) and **all** A2A routes accept the full six-header set.
+
+| Header | Purpose | MCP ASGI | MCP REST + A2A |
+|---|---|---|---|
+| `x-litellm-api-key: Bearer sk-...` | Preferred LiteLLM Virtual Key header. Use whenever the inbound `Authorization` header may carry a different token (OAuth passthrough, OBO, A2A per-user forwarding). | ✓ | ✓ |
+| `Authorization: Bearer sk-...` | Standard fallback. Stripped of the `Bearer ` prefix before lookup. | ✓ | ✓ |
+| `API-Key`, `x-api-key`, `x-goog-api-key`, `Ocp-Apim-Subscription-Key` | Vendor-specific aliases (Azure, Anthropic, Google AI Studio, Azure APIM). | — | ✓ |
+| `x-litellm-end-user-id` | End-user identification. Layers per-end-user budgets, MCP access intersection, and audit log entries on top of the key. `x-litellm-customer-id` is an accepted alias. | ✓ | ✓ |
+| `x-litellm-trace-id` | Cross-request correlation ID. Falls back to `x-litellm-session-id` or any matching `x-<vendor>-session-id` header. | ✓ | ✓ |
+| `x-litellm-session-id` | Session grouping. Same parse path as trace-id, lower priority. | ✓ | ✓ |
+| `x-litellm-tags` | Comma-separated tags for spend-log labeling and tag-based routing. Body field `tags` takes precedence. | — (not parsed on MCP ASGI) | ✓ |
+| `x-litellm-mcp-debug: true` | Returns masked diagnostic response headers (`x-mcp-debug-*`). See [MCP OAuth — Debugging](./mcp_oauth#debugging-oauth). | ✓ | — |
+| `x-mcp-servers` | Scope a request to specific MCP servers (comma-separated). | ✓ | — |
+
+---
+
+## 2. LiteLLM → Backend (authenticating the gateway to the agent or MCP server)
+
+This is the section where MCP and A2A diverge most. MCP has a first-class `auth_type` field on each server registration. **A2A has no `auth_type` field at all** — the outbound auth mode is inferred from what's present in `litellm_params`.
+
+### MCP — `auth_type` enum
+
+Nine values. `auth_type` determines the outbound auth header (or per-request SigV4 signature) that LiteLLM sends to the MCP server. See [MCP Overview — Add HTTP MCP Server](./mcp#add-http-mcp-server) for the full table.
+
+| `auth_type` | Mechanism | Dedicated docs |
+|---|---|---|
+| `none` | No auth header added | — |
+| `api_key` / `bearer_token` / `basic` / `authorization` / `token` | Static header, sent verbatim per call | [MCP Overview](./mcp) |
+| `oauth2` | PKCE (interactive) or M2M `client_credentials`. Discriminated by `oauth2_flow`. | [MCP OAuth](./mcp_oauth) |
+| `oauth2_token_exchange` | RFC 8693 On-Behalf-Of (OBO) — exchange the caller's bearer token for a scoped MCP token | [MCP OBO Auth](./mcp_obo_auth) |
+| `aws_sigv4` | Per-request SigV4 signature using a dedicated MCP-side credential chain | [MCP AWS SigV4](./mcp_aws_sigv4) |
+
+### A2A — auth mode inferred from `litellm_params`
+
+There is no `auth_type` field on an agent. The provider handler picks the auth mechanism from the contents of `litellm_params`:
+
+| Mode | When it fires | Sent to backend |
+|---|---|---|
+| **Bearer / JWT** | `litellm_params.api_key` is set | `Authorization: Bearer <api_key>` |
+| **SigV4** (AgentCore only) | `litellm_params.api_key` is unset and the provider is `bedrock` | Per-request SigV4 via the full `base_aws_llm` credential chain (six entry points: web-identity+role, role alone, profile, session-token triple, key+secret, env-vars / IRSA fallback). See [Bedrock AgentCore — A2A Gateway Authentication](./providers/bedrock_agentcore#a2a-gateway-authentication). |
+| **Provider-native** | `litellm_params.custom_llm_provider` matches a non-Bedrock provider (Vertex AI Agent Engine, LangGraph, Azure AI Foundry, Pydantic AI) | The provider's normal auth path |
+
+The dual JWT-vs-SigV4 mode is specific to AgentCore. Other A2A providers (Vertex, LangGraph, Azure Foundry) use the provider's own credential conventions — see the relevant provider page under [Providers](./providers).
+
+### Zero-trust add-on (MCP-only today)
+
+If the MCP server needs to **cryptographically verify** the request came through LiteLLM, layer the [MCP JWT Signer](./mcp_zero_trust) guardrail on top. It signs every outbound tool call with a short-lived RS256 JWT and publishes a JWKS endpoint the MCP server can verify against. This is a guardrail (`guardrail: mcp_jwt_signer`, `mode: pre_mcp_call`), not an `auth_type` — it composes with any `auth_type`.
+
+---
+
+## 3. Per-user header passthrough
+
+Both surfaces let clients forward credentials destined for a specific backend server/agent without admin pre-configuration. The conventions look symmetric but parse differently — be precise when copy-pasting.
+
+| Surface | Prefix | Parse rule | Match against | Example |
+|---|---|---|---|---|
+| **MCP** | `x-mcp-` | Split on the **first dash** after the prefix → `(server_alias, header_name)` | Server's `alias`, then `server_name` (case-insensitive) | `x-mcp-github-authorization: Bearer ghp_...` → server `github`, header `Authorization` |
+| **A2A** | `x-a2a-` | Exact-prefix match against `x-a2a-{agent_id_lower}-` or `x-a2a-{agent_name_lower}-`; everything after the trailing dash is the header name | Agent's UUID **and** human-readable name (both tried) | `x-a2a-my-agent-x-api-key: secret` → agent `my-agent`, header `x-api-key` |
+
+Both surfaces also support admin-controlled alternatives that compose with the user passthrough:
+
+| Mechanism | MCP | A2A | Notes |
+|---|---|---|---|
+| `static_headers: {K: V}` | ✓ | ✓ | Always sent. **Wins over user passthrough** on key conflicts. |
+| `extra_headers: [name, name, ...]` | ✓ | ✓ | Admin-allowlist of client header names to forward verbatim. |
+| `x-<surface>-<id>-<header>` convention | ✓ (`x-mcp-`) | ✓ (`x-a2a-`) | Client-driven, no admin config needed. |
+
+See [MCP Overview — Forwarding Custom Headers](./mcp#forwarding-custom-headers-to-mcp-servers) and [A2A Agent Authentication Headers](./a2a_agent_headers) for the full mechanics.
+
+---
+
+## 4. AuthZ — RBAC and access groups
+
+Both surfaces use the `object_permission` model with intersection-style resolution, but at different depths today. MCP resolves across five levels; A2A across two. The detailed flowcharts and tables live on the dedicated pages:
+
+- [MCP Permission Hierarchy](./mcp_control#permission-hierarchy)
+- [A2A Agent Permission Management — How It Works](./a2a_agent_permissions#how-it-works)
+
+| Level | MCP field | A2A field |
+|---|---|---|
+| **Key** | `object_permission.mcp_servers`, `object_permission.mcp_access_groups`, `object_permission.mcp_tool_permissions` | `object_permission.agents`, `object_permission.agent_access_groups` |
+| **Team** | Same | Same (inheritance-first: if the key has no list, it inherits the team's) |
+| **End user** | Same (via `x-litellm-end-user-id`) | — not resolved today |
+| **Agent** | Same (via `x-litellm-agent-id`) | — not applicable (the agent is the target) |
+| **Org** | Same — acts as a **ceiling** | — not resolved today |
+
+| Concern | MCP | A2A |
+|---|---|---|
+| Per-server / per-agent allowlist | `object_permission.mcp_servers` | `object_permission.agents` |
+| Access groups (tag-based grants) | `object_permission.mcp_access_groups` | `object_permission.agent_access_groups` |
+| Per-server tool-level allowlist | `object_permission.mcp_tool_permissions: {server_id: [tool, ...]}` | n/a (tools live inside the agent) |
+| Server-registration allowlist (admin-static) | `allowed_tools` / `disallowed_tools` on the MCP server | n/a |
+| Param-level allowlist | `allowed_params: {tool_name: [param, ...]}` on the MCP server | n/a |
+| Reject behaviour | `list_tools` filters out hidden servers; `call_tool` returns error | `GET /v1/agents` filters; `POST /a2a/{agent_id}` returns HTTP **403** |
+
+---
+
+## 5. Trace IDs and identity propagation
+
+`x-litellm-trace-id` is **accepted** on every request and threaded through logging on both surfaces. A few A2A-specific extras:
+
+| Setting | Scope | Behaviour |
+|---|---|---|
+| `require_trace_id_on_calls_to_agent: true` | Per-agent, on the agent's `litellm_params` | Reject inbound `/a2a/{agent_id}` calls missing `x-litellm-trace-id` (or `x-litellm-session-id` fallback) with **HTTP 400**. See [A2A Overview — Trace ID enforcement](./a2a#trace-id-enforcement-optional-per-agent). |
+| `require_trace_id_on_calls_by_agent: true` | Per-agent, on the agent's `litellm_params` | Reverse direction — when a key **owned by** that agent makes outbound calls, require a trace ID on those. |
+
+**Sub-agent identity propagation** — when LiteLLM dispatches a downstream call as part of an A2A invocation, it forwards `X-LiteLLM-Trace-Id` and `X-LiteLLM-Agent-Id` to maintain trace continuity and spend attribution. The original virtual key and end-user identity are **not** auto-forwarded. Use `extra_headers` or the `x-a2a-{agent_name_or_id}-{header}` convention to thread identity explicitly. See [A2A Overview — Sub-agent identity propagation](./a2a#sub-agent-identity-propagation).
+
+---
+
+## 6. Guardrails on the gateway path
+
+| Concern | MCP | A2A |
+|---|---|---|
+| Pre-call input guardrails (Presidio, Bedrock, Lakera, Aporia, etc.) | `mode: pre_mcp_call` | Standard chat-completion guardrails apply to the underlying LLM calls the agent makes |
+| During-call intervention | `mode: during_mcp_call` | — |
+| Zero-trust JWT signing | [`mcp_jwt_signer` guardrail](./mcp_zero_trust) | — (not applicable to A2A today) |
+| Documentation | [MCP Guardrails](./mcp_guardrail), [MCP Zero Trust](./mcp_zero_trust) | Standard [guardrails docs](./proxy/guardrails) apply via the agent's underlying model calls |
+
+---
+
+## 7. Cheatsheet — what header does what
+
+For copy-paste, the high-frequency request headers across both surfaces:
+
+```http
+# Always (LiteLLM-side auth and identification)
+x-litellm-api-key: Bearer sk-...
+# or
+Authorization: Bearer sk-...
+
+x-litellm-end-user-id: user-42
+x-litellm-trace-id: 8f4a-2b1c-d3e5-...
+
+# MCP — server scoping / per-user passthrough
+x-mcp-servers: github,zapier
+x-mcp-github-authorization: Bearer ghp_<user-token>     # user passthrough to the github server
+x-litellm-mcp-debug: true                                # diagnostic response headers
+
+# A2A — per-user passthrough
+x-a2a-my-agent-authorization: Bearer <user-token>        # caller's token to my-agent
+x-a2a-my-agent-x-api-key: <user-key>                     # additional per-agent header
+```
+
+For the deep dives, follow the cross-links above into the dedicated pages.
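Review note on the `docs/auth_overview.md` hunk above: the two passthrough parse rules in section 3 differ in a way that is easier to see in code. The Python below is a minimal sketch of the rules as the table states them, not the gateway's actual parser. Both function names are hypothetical, and returning lower-cased header names is a simplification.

```python
from typing import Dict, Mapping, Optional, Tuple


def parse_mcp_passthrough(header_name: str) -> Optional[Tuple[str, str]]:
    """MCP rule: strip 'x-mcp-', then split on the first dash."""
    name = header_name.lower()
    prefix = "x-mcp-"
    if not name.startswith(prefix):
        return None
    alias, dash, forwarded = name[len(prefix):].partition("-")
    if not dash or not alias or not forwarded:
        return None  # e.g. 'x-mcp-servers' has no alias/header split
    return alias, forwarded


def a2a_passthrough_headers(
    headers: Mapping[str, str], agent_id: str, agent_name: str
) -> Dict[str, str]:
    """A2A rule: exact-prefix match on the agent's ID or name."""
    prefixes = [
        f"x-a2a-{ident.lower()}-" for ident in (agent_id, agent_name) if ident
    ]
    forwarded: Dict[str, str] = {}
    for raw_name, value in headers.items():
        name = raw_name.lower()
        for prefix in prefixes:
            if name.startswith(prefix) and len(name) > len(prefix):
                # Everything after the trailing dash is the header name.
                forwarded[name[len(prefix):]] = value
                break
    return forwarded
```

Under the first-dash rule as described, an MCP alias that itself contains a dash would split early in this sketch, whereas the A2A exact-prefix match handles a dashed name such as `my-agent`. That is the asymmetry to keep in mind when copying header names between surfaces.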
diff --git a/docs/mcp.md b/docs/mcp.md
@@ -226,13 +226,19 @@ mcp_servers:
 - **Description**: Optional description for the server
 - **Auth Type**: Optional authentication type. Supported values:
 
-  | Value | Header sent |
+  | Value | Header sent (managed SSE/HTTP transport) |
   |-------|-------------|
+  | `none` | No auth header added |
   | `api_key` | `X-API-Key: <auth_value>` |
   | `bearer_token` | `Authorization: Bearer <auth_value>` |
   | `basic` | `Authorization: Basic <auth_value>` |
-  | `authorization` | `Authorization: <auth_value>` |
-  | `aws_sigv4` | Per-request AWS SigV4 signature ([details](./mcp_aws_sigv4.md)) |
+  | `authorization` | `Authorization: <auth_value>` (verbatim, no prefix) |
+  | `token` | `Authorization: token <auth_value>` (GitHub-style) |
+  | `oauth2` | `Authorization: Bearer <resolved_token>` — PKCE or M2M `client_credentials`. See [MCP OAuth](./mcp_oauth.md) |
+  | `oauth2_token_exchange` | `Authorization: Bearer <exchanged_token>` — RFC 8693 On-Behalf-Of. See [MCP OBO Auth](./mcp_obo_auth.md) |
+  | `aws_sigv4` | Per-request AWS SigV4 signature. See [MCP AWS SigV4](./mcp_aws_sigv4.md) |
+
+  Note: the header table above describes the managed SSE/HTTP transport path. The OpenAPI-tool path emits `Authorization: ApiKey <value>` instead of `X-API-Key` for `auth_type: api_key`; the deprecated `x-mcp-auth` broadcast header also uses the `ApiKey` form.
 
 - **Extra Headers**: Optional list of additional header names that should be forwarded from client to the MCP server
 - **Static Headers**: Optional map of header key/value pairs to include every request to the MCP server.
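Review note on the `docs/mcp.md` hunk above: the static rows of the `auth_type` table reduce to a small lookup. The Python below is an illustrative sketch for the managed SSE/HTTP transport path only. `static_auth_header` and `_STATIC_FORMATS` are hypothetical names, and the token-resolving modes (`oauth2`, `oauth2_token_exchange`, `aws_sigv4`) are deliberately left out because they need runtime state.

```python
from typing import Dict, Tuple

# (header name, value template) per static auth_type, as listed in the table.
_STATIC_FORMATS: Dict[str, Tuple[str, str]] = {
    "api_key": ("X-API-Key", "{v}"),
    "bearer_token": ("Authorization", "Bearer {v}"),
    "basic": ("Authorization", "Basic {v}"),
    "authorization": ("Authorization", "{v}"),  # verbatim, no prefix
    "token": ("Authorization", "token {v}"),  # GitHub-style
}


def static_auth_header(auth_type: str, auth_value: str) -> Dict[str, str]:
    """Build the outbound header for a static auth_type (hypothetical helper)."""
    if auth_type == "none":
        return {}  # no auth header added
    if auth_type not in _STATIC_FORMATS:
        # oauth2, oauth2_token_exchange, and aws_sigv4 are resolved per request.
        raise ValueError(f"{auth_type!r} is not a static auth_type")
    header, template = _STATIC_FORMATS[auth_type]
    return {header: template.format(v=auth_value)}
```

As the note in the hunk says, the OpenAPI-tool path formats `api_key` differently, so this mapping should not be read as covering that path.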