36 changes: 35 additions & 1 deletion docs/a2a.md
@@ -211,12 +211,46 @@ POST /a2a/{agent_name}/message/send

### Authentication

Include your LiteLLM Virtual Key in the `Authorization` header:
Include your LiteLLM Virtual Key in either of two headers — `x-litellm-api-key` is preferred when the inbound `Authorization` header may carry a token destined for the backend agent (e.g. when using the [convention-based passthrough](./a2a_agent_headers#method-3--convention-based-forwarding) to forward the caller's identity).

```
Authorization: Bearer sk-your-litellm-key
# or
x-litellm-api-key: Bearer sk-your-litellm-key
```

#### Per-agent permission check

After the virtual key is authenticated, LiteLLM checks whether the calling key (and its team) is allowed to invoke the requested agent. If not, the response is HTTP 403. See [Agent Permission Management](./a2a_agent_permissions) for the full intersection model and access groups.

#### Trace ID enforcement (optional, per-agent)

An agent can require every inbound request to carry a trace ID for cross-system audit threading. Set `require_trace_id_on_calls_to_agent: true` in the agent's `litellm_params`. When set, requests that carry neither `x-litellm-trace-id` nor the `x-litellm-session-id` fallback are rejected with HTTP 400.

```bash title="Register an agent that requires inbound trace IDs" showLineNumbers
curl -X POST http://localhost:4000/v1/agents \
  -H "Authorization: Bearer sk-master-key" \
  -H "Content-Type: application/json" \
  -d '{
    "agent_name": "audit-critical-agent",
    "agent_card_params": { ... },
    "litellm_params": {
      "require_trace_id_on_calls_to_agent": true
    }
  }'
```
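The enforcement rule above can be sketched in a few lines of Python. This is an illustrative model only, not LiteLLM's implementation: the function names `resolve_trace_id` and `check_trace_id` are hypothetical, and the sketch assumes that an empty header value counts as missing.

```python
from typing import Mapping, Optional

# Checked in priority order: trace-id first, session-id as the fallback.
TRACE_HEADERS = ("x-litellm-trace-id", "x-litellm-session-id")


def resolve_trace_id(headers: Mapping[str, str]) -> Optional[str]:
    """Return the first non-empty trace/session ID, or None if neither is set."""
    lowered = {name.lower(): value for name, value in headers.items()}
    for name in TRACE_HEADERS:
        value = lowered.get(name, "").strip()
        if value:
            return value
    return None


def check_trace_id(headers: Mapping[str, str], litellm_params: Mapping[str, object]) -> int:
    """Return the HTTP status this sketch would produce: 400 on rejection, else 200."""
    required = bool(litellm_params.get("require_trace_id_on_calls_to_agent"))
    if required and resolve_trace_id(headers) is None:
        return 400
    return 200
```

With the flag set, a request carrying only `x-litellm-session-id` still passes, which matches the fallback described above.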

The reverse direction — enforcing trace ID on **outbound** calls made by a key owned by an agent — is controlled by `require_trace_id_on_calls_by_agent` on the same `litellm_params` block.

#### Sub-agent identity propagation

When the backend agent itself calls LiteLLM (for chat completions or to invoke a sub-agent), LiteLLM forwards two headers to maintain trace continuity:

- `X-LiteLLM-Trace-Id` — links all calls in the chain to a single trace
- `X-LiteLLM-Agent-Id` — attributes spend to the originating agent

The caller's **virtual key** and **end-user ID** are not automatically forwarded. If the downstream agent needs the user's identity, propagate it explicitly via [`extra_headers` or the `x-a2a-{agent_name_or_id}-{header}` convention](./a2a_agent_headers).
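As a rough illustration of the forwarding behaviour described above, the following Python sketch builds the downstream header set. The function `build_downstream_headers` and its parameters are hypothetical names chosen for this example. It assumes `extra_headers` behaves as a simple allowlist of inbound header names.

```python
from typing import Dict, Iterable, Mapping


def build_downstream_headers(
    trace_id: str,
    agent_id: str,
    inbound_headers: Mapping[str, str],
    extra_headers: Iterable[str] = (),
) -> Dict[str, str]:
    """Model which headers reach the downstream call (hypothetical helper)."""
    # Forwarded by default: trace continuity and spend attribution.
    outbound = {"X-LiteLLM-Trace-Id": trace_id, "X-LiteLLM-Agent-Id": agent_id}

    # Everything else is opt-in: only allowlisted inbound headers are copied.
    lowered = {name.lower(): value for name, value in inbound_headers.items()}
    for name in extra_headers:
        value = lowered.get(name.lower())
        if value is not None:
            outbound[name] = value
    return outbound
```

Calling it without `extra_headers` yields only the two LiteLLM headers. The caller's `Authorization` value and end-user ID appear downstream only when they are explicitly allowlisted.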

### Request Format

LiteLLM follows the [A2A JSON-RPC 2.0 specification](https://github.com/google/A2A):
45 changes: 41 additions & 4 deletions docs/a2a_agent_permissions.md
@@ -208,35 +208,72 @@ curl -X POST "http://localhost:4000/a2a/agent-456" \
-d '{"message": {"role": "user", "parts": [{"type": "text", "text": "Hello"}]}}'
```

## Agent Access Groups

Granting individual agents to every key or team gets unwieldy as the agent catalog grows. **Agent access groups** let you tag agents with logical labels in the dashboard, then grant the **group** to a key or team — adding a new agent to the group automatically makes it available to every key/team that holds the group.

### 1. Tag the agent with one or more groups

In the LiteLLM dashboard:

1. Go to **Agents**.
2. Create or edit an agent.
3. Under **Access Groups**, type a group name (e.g. `clinical-tools`) and press Enter.

:::note
Tagging an agent with access groups is currently a dashboard-only operation. The `POST /v1/agents` body schema does not expose `agent_access_groups` as a top-level field; the group tags persist via the underlying DB column and are consumed during permission resolution.
:::

### 2. Grant a key or team the group

```bash title="Key with access to two agent groups" showLineNumbers
curl -X POST "http://localhost:4000/key/generate" \
  -H "Authorization: Bearer sk-master-key" \
  -H "Content-Type: application/json" \
  -d '{
    "object_permission": {
      "agent_access_groups": ["clinical-tools", "research-tools"]
    }
  }'
```

The key now has access to every agent tagged with either group — no per-agent enumeration required. The same `agent_access_groups` field is also valid on a team's `object_permission`.

When a key has **both** a direct `agents` list and `agent_access_groups`, the union is computed (any agent reached by either path is allowed), and then the team-level intersection is applied as described below.

## How It Works

```mermaid
flowchart TD
A[Request to invoke agent] --> B{LiteLLM Virtual Key has agent restrictions?}
B -->|Yes| C{LiteLLM Team has agent restrictions?}
B -->|No| D{LiteLLM Team has agent restrictions?}

C -->|Yes| E[Use intersection of key + team permissions]
C -->|No| F[Use key permissions only]

D -->|Yes| G[Inherit team permissions]
D -->|No| H[Allow ALL agents]

E --> I{Agent in allowed list?}
F --> I
G --> I
H --> J[Allow request]

I -->|Yes| J
I -->|No| K[Return 403 Forbidden]
```

A2A permission resolution operates over two levels: Key and Team. MCP's [permission hierarchy](./mcp_control#permission-hierarchy) additionally covers the End-user, Agent, and Org levels, so agent permissions are a narrower model today.

| Key Permissions | Team Permissions | Result | Notes |
|-----------------|------------------|--------|-------|
| None | None | Key can access **all** agents | Open access by default when no restrictions are set |
| `["agent-1", "agent-2"]` | None | Key can access `agent-1` and `agent-2` | Key uses its own permissions |
| None | `["agent-1", "agent-3"]` | Key can access `agent-1` and `agent-3` | Key inherits team's permissions |
| `["agent-1", "agent-2"]` | `["agent-1", "agent-3"]` | Key can access `agent-1` only | Intersection of both lists (most restrictive wins) |
| `agent_access_groups: ["clinical"]` | None | Key can access every agent tagged `clinical` | Access groups resolved to concrete agent IDs |
| `agent_access_groups: ["clinical"]` | `agents: ["agent-1"]` | Intersection of (every agent tagged `clinical`) and `["agent-1"]` | Mixing direct and group grants is supported |
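The table rows above can be reproduced with a small Python model. This is a sketch under stated assumptions, not the proxy's code: the helper names (`expand_grants`, `resolve_allowed_agents`, `is_allowed`) and the `group_members` mapping from group tag to agent IDs are hypothetical. The sketch also assumes that an empty or absent grant means "no restriction", represented as `None`.

```python
from typing import Iterable, Mapping, Optional, Set


def expand_grants(
    agents: Optional[Iterable[str]],
    access_groups: Optional[Iterable[str]],
    group_members: Mapping[str, Iterable[str]],
) -> Optional[Set[str]]:
    """Union of direct agents and group-tagged agents; None means unrestricted."""
    agents = list(agents or [])
    access_groups = list(access_groups or [])
    if not agents and not access_groups:
        return None
    allowed = set(agents)
    for group in access_groups:
        allowed |= set(group_members.get(group, ()))
    return allowed


def resolve_allowed_agents(
    key_grants: Optional[Set[str]], team_grants: Optional[Set[str]]
) -> Optional[Set[str]]:
    """Apply the key/team rules: inherit, key-only, or intersection."""
    if key_grants is None:
        return team_grants  # inherit the team's list (None means all agents)
    if team_grants is None:
        return key_grants  # key permissions only
    return key_grants & team_grants  # most restrictive wins


def is_allowed(agent_id: str, allowed: Optional[Set[str]]) -> bool:
    """False here corresponds to the HTTP 403 branch of the flowchart."""
    return allowed is None or agent_id in allowed
```

For example, a key granted the `clinical` group on a team restricted to `agent-1` resolves to `agent-1` only when that agent carries the `clinical` tag. Otherwise the intersection is empty and every call is rejected.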

## Viewing Permissions

161 changes: 161 additions & 0 deletions docs/auth_overview.md
@@ -0,0 +1,161 @@
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

# AuthN/AuthZ Reference — MCP and A2A Side-by-Side

LiteLLM exposes two gateway surfaces that share most authentication and authorization primitives but diverge in a few important places. This page is the side-by-side reference: which header does what, where the two surfaces are symmetric, and where they're not. Each section links out to the dedicated page for the deep dive.

| Surface | Endpoints | Dedicated docs |
|---|---|---|
| **MCP Gateway** | `/mcp`, `/{server}/mcp`, `/toolset/{name}/mcp`, `/sse`, `/v1/mcp/...`, `/mcp-rest/...` | [MCP Overview](./mcp) |
| **A2A Agent Gateway** | `/a2a/{agent_id}`, `/a2a/{agent_id}/message/send`, `/v1/agents/...` | [A2A Overview](./a2a) |

---

## 1. Client → LiteLLM (authenticating the caller)

Both surfaces accept the same LiteLLM Virtual Key headers and the same identification headers. The one place they diverge: the MCP **ASGI** routes (the streamable MCP endpoints at `/mcp`, `/{name}/mcp`, `/toolset/{name}/mcp`, `/sse`) bypass the standard FastAPI auth dependency and only check `x-litellm-api-key` and `Authorization`. The MCP **REST/management** routes (`/v1/mcp/...`, `/mcp-rest/...`) and **all** A2A routes accept the full six-header set.

| Header | Purpose | MCP ASGI | MCP REST + A2A |
|---|---|---|---|
| `x-litellm-api-key: Bearer sk-...` | Preferred LiteLLM Virtual Key header. Use whenever the inbound `Authorization` header may carry a different token (OAuth passthrough, OBO, A2A per-user forwarding). | ✓ | ✓ |
| `Authorization: Bearer sk-...` | Standard fallback. Stripped of the `Bearer ` prefix before lookup. | ✓ | ✓ |
| `API-Key`, `x-api-key`, `x-goog-api-key`, `Ocp-Apim-Subscription-Key` | Vendor-specific aliases (Azure, Anthropic, Google AI Studio, Azure APIM). | — | ✓ |
| `x-litellm-end-user-id` | End-user identification. Layers per-end-user budgets, MCP access intersection, and audit log entries on top of the key. `x-litellm-customer-id` is an accepted alias. | ✓ | ✓ |
| `x-litellm-trace-id` | Cross-request correlation ID. Falls back to `x-litellm-session-id` or any matching `x-<vendor>-session-id` header. | ✓ | ✓ |
| `x-litellm-session-id` | Session grouping. Same parse path as trace-id, lower priority. | ✓ | ✓ |
| `x-litellm-tags` | Comma-separated tags for spend-log labeling and tag-based routing. Body field `tags` takes precedence. | — (not parsed on MCP ASGI) | ✓ |
| `x-litellm-mcp-debug: true` | Returns masked diagnostic response headers (`x-mcp-debug-*`). See [MCP OAuth — Debugging](./mcp_oauth#debugging-oauth). | ✓ | — |
| `x-mcp-servers` | Scope a request to specific MCP servers (comma-separated). | ✓ | — |
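To make the route split concrete, here is a hedged Python sketch of virtual-key extraction. The function `extract_virtual_key` is a hypothetical name, and the lookup order among the vendor aliases is an assumption. The table above only establishes that `x-litellm-api-key` is preferred over `Authorization` and that MCP ASGI routes check those two headers alone.

```python
from typing import Mapping, Optional

# Headers checked on the MCP ASGI routes, in priority order.
ASGI_KEY_HEADERS = ("x-litellm-api-key", "authorization")
# Full six-header set for MCP REST and A2A routes (alias order is assumed).
FULL_KEY_HEADERS = ASGI_KEY_HEADERS + (
    "api-key",
    "x-api-key",
    "x-goog-api-key",
    "ocp-apim-subscription-key",
)


def extract_virtual_key(headers: Mapping[str, str], asgi_route: bool = False) -> Optional[str]:
    """Return the LiteLLM key from the first populated header, minus any Bearer prefix."""
    lowered = {name.lower(): value for name, value in headers.items()}
    candidates = ASGI_KEY_HEADERS if asgi_route else FULL_KEY_HEADERS
    for name in candidates:
        value = lowered.get(name, "").strip()
        if not value:
            continue
        if value.lower().startswith("bearer "):
            value = value[len("bearer "):].strip()
        return value
    return None
```

When both headers are present, the sketch returns the key from `x-litellm-api-key`, which leaves `Authorization` free to carry a token meant for the backend.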

---

## 2. LiteLLM → Backend (authenticating the gateway to the agent or MCP server)

This is the section where MCP and A2A diverge most. MCP has a first-class `auth_type` field on each server registration. **A2A has no `auth_type` field at all** — the outbound auth mode is inferred from what's present in `litellm_params`.

### MCP — `auth_type` enum

Nine values. The MCP server's outbound `Authorization` header (or per-request SigV4 signature) is determined by `auth_type`. See [MCP Overview — Add HTTP MCP Server](./mcp#add-http-mcp-server) for the full table.

| `auth_type` | Mechanism | Dedicated docs |
|---|---|---|
| `none` | No auth header added | — |
| `api_key` / `bearer_token` / `basic` / `authorization` / `token` | Static header, sent verbatim per call | [MCP Overview](./mcp) |
| `oauth2` | PKCE (interactive) or M2M `client_credentials`. Discriminated by `oauth2_flow`. | [MCP OAuth](./mcp_oauth) |
| `oauth2_token_exchange` | RFC 8693 On-Behalf-Of (OBO) — exchange the caller's bearer token for a scoped MCP token | [MCP OBO Auth](./mcp_obo_auth) |
| `aws_sigv4` | Per-request SigV4 signature using a dedicated MCP-side credential chain | [MCP AWS SigV4](./mcp_aws_sigv4) |

### A2A — auth mode inferred from `litellm_params`

There is no `auth_type` field on an agent. The provider handler picks the auth mechanism from the contents of `litellm_params`:

| Mode | When it fires | Sent to backend |
|---|---|---|
| **Bearer / JWT** | `litellm_params.api_key` is set | `Authorization: Bearer <api_key>` |
| **SigV4** (AgentCore only) | `litellm_params.api_key` is unset and the provider is `bedrock` | Per-request SigV4 via the full `base_aws_llm` credential chain (six entry points: web-identity+role, role alone, profile, session-token triple, key+secret, env-vars / IRSA fallback). See [Bedrock AgentCore — A2A Gateway Authentication](./providers/bedrock_agentcore#a2a-gateway-authentication). |
| **Provider-native** | `litellm_params.custom_llm_provider` matches a non-Bedrock provider (Vertex AI Agent Engine, LangGraph, Azure AI Foundry, Pydantic AI) | The provider's normal auth path |

The dual JWT-vs-SigV4 mode is specific to AgentCore. Other A2A providers (Vertex, LangGraph, Azure Foundry) use the provider's own credential conventions — see the relevant provider page under [Providers](./providers).
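The inference described in the table can be approximated as follows. This Python sketch is an assumption-laden reading of the table, not the provider handler itself: `infer_a2a_auth_mode` is a hypothetical name, the rows are assumed to be evaluated top to bottom, and `None` is returned for any combination the table does not cover.

```python
from typing import Mapping, Optional


def infer_a2a_auth_mode(litellm_params: Mapping[str, object]) -> Optional[str]:
    """Guess the outbound auth mode from litellm_params (illustrative only)."""
    api_key = litellm_params.get("api_key")
    provider = litellm_params.get("custom_llm_provider")

    if api_key:
        return "bearer"  # Authorization: Bearer <api_key>
    if provider == "bedrock":
        return "sigv4"  # AgentCore with no api_key: per-request SigV4
    if provider:
        return "provider_native"  # the provider's own credential path
    return None  # combination not described by the table
```

How a non-Bedrock provider treats an explicit `api_key` is provider-specific, so the `"bearer"` result for that case should be read as the table's first row and not as confirmed handler behaviour.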

### Zero-trust add-on (MCP-only today)

If the MCP server needs to **cryptographically verify** the request came through LiteLLM, layer the [MCP JWT Signer](./mcp_zero_trust) guardrail on top. It signs every outbound tool call with a short-lived RS256 JWT and publishes a JWKS endpoint the MCP server can verify against. This is a guardrail (`guardrail: mcp_jwt_signer`, `mode: pre_mcp_call`), not an `auth_type` — it composes with any `auth_type`.

---

## 3. Per-user header passthrough

Both surfaces let clients forward credentials destined for a specific backend server/agent without admin pre-configuration. The conventions look symmetric but parse differently — be precise when copy-pasting.

| Surface | Prefix | Parse rule | Match against | Example |
|---|---|---|---|---|
| **MCP** | `x-mcp-` | Split on the **first dash** after the prefix → `(server_alias, header_name)` | Server's `alias`, then `server_name` (case-insensitive) | `x-mcp-github-authorization: Bearer ghp_...` → server `github`, header `Authorization` |
| **A2A** | `x-a2a-` | Exact-prefix match against `x-a2a-{agent_id_lower}-` or `x-a2a-{agent_name_lower}-`; everything after the trailing dash is the header name | Agent's UUID **and** human-readable name (both tried) | `x-a2a-my-agent-x-api-key: secret` → agent `my-agent`, header `x-api-key` |
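The difference between the two parse rules is easiest to see in code. The Python sketch below implements the rules exactly as the table states them. The function names are hypothetical, and header names are lowercased on the assumption that matching is case-insensitive.

```python
from typing import Optional, Tuple


def parse_mcp_passthrough(header_name: str) -> Optional[Tuple[str, str]]:
    """Split on the first dash after `x-mcp-` into (server_alias, header_name)."""
    name = header_name.lower()
    prefix = "x-mcp-"
    if not name.startswith(prefix):
        return None
    alias, sep, forwarded = name[len(prefix):].partition("-")
    if not sep or not alias or not forwarded:
        return None  # e.g. `x-mcp-servers` has no second segment
    return alias, forwarded


def parse_a2a_passthrough(header_name: str, agent_id: str, agent_name: str) -> Optional[str]:
    """Match `x-a2a-{id or name}-` exactly; the remainder is the header name."""
    name = header_name.lower()
    for identifier in (agent_id, agent_name):
        prefix = f"x-a2a-{identifier.lower()}-"
        if name.startswith(prefix) and len(name) > len(prefix):
            return name[len(prefix):]
    return None
```

Under these rules, `x-a2a-my-agent-x-api-key` resolves cleanly for an agent named `my-agent` because the whole agent name is matched as a prefix. The first-dash split on the MCP side would instead read `x-mcp-my-server-authorization` as alias `my` with header `server-authorization`.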

Both surfaces also support admin-controlled alternatives that compose with the user passthrough:

| Mechanism | MCP | A2A | Notes |
|---|---|---|---|
| `static_headers: {K: V}` | ✓ | ✓ | Always sent. **Wins over user passthrough** on key conflicts. |
| `extra_headers: [name, name, ...]` | ✓ | ✓ | Admin-allowlist of client header names to forward verbatim. |
| `x-<surface>-<id>-<header>` convention | ✓ (`x-mcp-`) | ✓ (`x-a2a-`) | Client-driven, no admin config needed. |
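The precedence rule in the first row can be modelled as a simple ordered merge. This is a minimal Python sketch; `merge_backend_headers` is a hypothetical helper. It assumes header names are compared case-insensitively, so the merged keys are lowercased.

```python
from typing import Dict, Mapping


def merge_backend_headers(
    static_headers: Mapping[str, str],
    passthrough_headers: Mapping[str, str],
) -> Dict[str, str]:
    """Combine headers so admin-set static values win on key conflicts."""
    merged: Dict[str, str] = {}
    # Client-supplied passthrough headers go in first...
    for name, value in passthrough_headers.items():
        merged[name.lower()] = value
    # ...then static headers overwrite any conflicting entries.
    for name, value in static_headers.items():
        merged[name.lower()] = value
    return merged
```

Applying static headers last means a client cannot override an admin-configured `Authorization` value through the passthrough convention.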

See [MCP Overview — Forwarding Custom Headers](./mcp#forwarding-custom-headers-to-mcp-servers) and [A2A Agent Authentication Headers](./a2a_agent_headers) for the full mechanics.

---

## 4. AuthZ — RBAC and access groups

Both surfaces use the `object_permission` model with intersection-style resolution, but at different depths today. MCP resolves across five levels; A2A across two. The detailed flowcharts and tables live on the dedicated pages:

- [MCP Permission Hierarchy](./mcp_control#permission-hierarchy)
- [A2A Agent Permission Management — How It Works](./a2a_agent_permissions#how-it-works)

| Level | MCP field | A2A field |
|---|---|---|
| **Key** | `object_permission.mcp_servers`, `object_permission.mcp_access_groups`, `object_permission.mcp_tool_permissions` | `object_permission.agents`, `object_permission.agent_access_groups` |
| **Team** | Same | Same (inheritance-first: if the key has no list, it inherits the team's) |
| **End user** | Same (via `x-litellm-end-user-id`) | — not resolved today |
| **Agent** | Same (via `x-litellm-agent-id`) | — not applicable (the agent is the target) |
| **Org** | Same — acts as a **ceiling** | — not resolved today |

| Concern | MCP | A2A |
|---|---|---|
| Per-server / per-agent allowlist | `object_permission.mcp_servers` | `object_permission.agents` |
| Access groups (tag-based grants) | `object_permission.mcp_access_groups` | `object_permission.agent_access_groups` |
| Per-server tool-level allowlist | `object_permission.mcp_tool_permissions: {server_id: [tool, ...]}` | n/a (tools live inside the agent) |
| Server-registration allowlist (admin-static) | `allowed_tools` / `disallowed_tools` on the MCP server | n/a |
| Param-level allowlist | `allowed_params: {tool_name: [param, ...]}` on the MCP server | n/a |
| Reject behaviour | `list_tools` filters out hidden servers; `call_tool` returns error | `GET /v1/agents` filters; `POST /a2a/{agent_id}` returns HTTP **403** |

---

## 5. Trace IDs and identity propagation

`x-litellm-trace-id` is **accepted** on every request and threaded through logging on both surfaces. A2A adds two per-agent enforcement settings:

| Setting | Scope | Behaviour |
|---|---|---|
| `require_trace_id_on_calls_to_agent: true` | Per-agent, on the agent's `litellm_params` | Reject inbound `/a2a/{agent_id}` calls missing `x-litellm-trace-id` (or `x-litellm-session-id` fallback) with **HTTP 400**. See [A2A Overview — Trace ID enforcement](./a2a#trace-id-enforcement-optional-per-agent). |
| `require_trace_id_on_calls_by_agent: true` | Per-agent, on the agent's `litellm_params` | Reverse direction — when a key **owned by** that agent makes outbound calls, require a trace ID on those. |

**Sub-agent identity propagation** — when LiteLLM dispatches a downstream call as part of an A2A invocation, it forwards `X-LiteLLM-Trace-Id` and `X-LiteLLM-Agent-Id` to maintain trace continuity and spend attribution. The original virtual key and end-user identity are **not** auto-forwarded. Use `extra_headers` or the `x-a2a-{agent_name_or_id}-{header}` convention to thread identity explicitly. See [A2A Overview — Sub-agent identity propagation](./a2a#sub-agent-identity-propagation).

---

## 6. Guardrails on the gateway path

| Concern | MCP | A2A |
|---|---|---|
| Pre-call input guardrails (Presidio, Bedrock, Lakera, Aporia, etc.) | `mode: pre_mcp_call` | Standard chat-completion guardrails apply to the underlying LLM calls the agent makes |
| During-call intervention | `mode: during_mcp_call` | — |
| Zero-trust JWT signing | [`mcp_jwt_signer` guardrail](./mcp_zero_trust) | — (not applicable to A2A today) |
| Documentation | [MCP Guardrails](./mcp_guardrail), [MCP Zero Trust](./mcp_zero_trust) | Standard [guardrails docs](./proxy/guardrails) apply via the agent's underlying model calls |

---

## 7. Cheatsheet — what header does what

For copy-paste, the high-frequency request headers across both surfaces:

```http
# Always (LiteLLM-side auth and identification)
x-litellm-api-key: Bearer sk-...
# or
Authorization: Bearer sk-...

x-litellm-end-user-id: user-42
x-litellm-trace-id: 8f4a-2b1c-d3e5-...

# MCP — server scoping / per-user passthrough
x-mcp-servers: github,zapier
x-mcp-github-authorization: Bearer ghp_<user-token>  # user passthrough to the github server
x-litellm-mcp-debug: true # diagnostic response headers

# A2A — per-user passthrough
x-a2a-my-agent-authorization: Bearer <user-token> # caller's token to my-agent
x-a2a-my-agent-x-api-key: <user-key> # additional per-agent header
```

For the deep dives, follow the cross-links above into the dedicated pages.
12 changes: 9 additions & 3 deletions docs/mcp.md
@@ -226,13 +226,19 @@ mcp_servers:
- **Description**: Optional description for the server
- **Auth Type**: Optional authentication type. Supported values:

| Value | Header sent |
| Value | Header sent (managed SSE/HTTP transport) |
|-------|-------------|
| `none` | No auth header added |
| `api_key` | `X-API-Key: <auth_value>` |
| `bearer_token` | `Authorization: Bearer <auth_value>` |
| `basic` | `Authorization: Basic <auth_value>` |
| `authorization` | `Authorization: <auth_value>` |
| `aws_sigv4` | Per-request AWS SigV4 signature ([details](./mcp_aws_sigv4.md)) |
| `authorization` | `Authorization: <auth_value>` (verbatim, no prefix) |
| `token` | `Authorization: token <auth_value>` (GitHub-style) |
| `oauth2` | `Authorization: Bearer <resolved_token>` — PKCE or M2M `client_credentials`. See [MCP OAuth](./mcp_oauth.md) |
| `oauth2_token_exchange` | `Authorization: Bearer <exchanged_token>` — RFC 8693 On-Behalf-Of. See [MCP OBO Auth](./mcp_obo_auth.md) |
| `aws_sigv4` | Per-request AWS SigV4 signature. See [MCP AWS SigV4](./mcp_aws_sigv4.md) |

Note: the header table above describes the managed SSE/HTTP transport path. The OpenAPI-tool path emits `Authorization: ApiKey <value>` instead of `X-API-Key` for `auth_type: api_key`; the deprecated `x-mcp-auth` broadcast header also uses the `ApiKey` form.

- **Extra Headers**: Optional list of additional header names that should be forwarded from client to the MCP server
- **Static Headers**: Optional map of header key/value pairs to include on every request to the MCP server.
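
For the static `auth_type` values in the table above, the header mapping on the managed SSE/HTTP transport path can be written out as a lookup. This Python sketch is illustrative: `static_auth_header` is a hypothetical name, and it deliberately excludes `oauth2`, `oauth2_token_exchange`, and `aws_sigv4`, since those resolve a token or signature per request.

```python
from typing import Dict

# (header name, value template) per static auth_type, taken from the table.
_STATIC_AUTH_TEMPLATES = {
    "api_key": ("X-API-Key", "{value}"),
    "bearer_token": ("Authorization", "Bearer {value}"),
    "basic": ("Authorization", "Basic {value}"),
    "authorization": ("Authorization", "{value}"),  # verbatim, no prefix
    "token": ("Authorization", "token {value}"),  # GitHub-style
}


def static_auth_header(auth_type: str, auth_value: str = "") -> Dict[str, str]:
    """Return the outbound auth header for a static auth_type (sketch only)."""
    if auth_type == "none":
        return {}
    if auth_type not in _STATIC_AUTH_TEMPLATES:
        raise ValueError(f"not a static auth_type: {auth_type!r}")
    header_name, template = _STATIC_AUTH_TEMPLATES[auth_type]
    return {header_name: template.format(value=auth_value)}
```

The OpenAPI-tool path mentioned in the note uses a different form for `api_key`, so this mapping applies to the managed transport only.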