From 248e49f00d54c2b9c506a43984b2a5246fe47058 Mon Sep 17 00:00:00 2001 From: Gabriele Michelli Date: Mon, 18 May 2026 19:15:03 +0200 Subject: [PATCH 1/5] docs(mcp): complete auth_type table, OAuth config reference, RBAC intersection model, hub-vs-public-internet distinction --- docs/mcp.md | 8 ++- docs/mcp_control.md | 140 +++++++++++++++++++++++++++++++++++- docs/mcp_oauth.md | 16 +++-- docs/mcp_public_internet.md | 27 +++++++ 4 files changed, 182 insertions(+), 9 deletions(-) diff --git a/docs/mcp.md b/docs/mcp.md index f6fe01ac2..ccda31d62 100644 --- a/docs/mcp.md +++ b/docs/mcp.md @@ -228,11 +228,15 @@ mcp_servers: | Value | Header sent | |-------|-------------| + | `none` | No auth header added | | `api_key` | `X-API-Key: ` | | `bearer_token` | `Authorization: Bearer ` | | `basic` | `Authorization: Basic ` | - | `authorization` | `Authorization: ` | - | `aws_sigv4` | Per-request AWS SigV4 signature ([details](./mcp_aws_sigv4.md)) | + | `authorization` | `Authorization: ` (verbatim, no prefix) | + | `token` | `Authorization: token ` (GitHub-style) | + | `oauth2` | `Authorization: Bearer ` — PKCE or M2M `client_credentials`. See [MCP OAuth](./mcp_oauth.md) | + | `oauth2_token_exchange` | `Authorization: Bearer ` — RFC 8693 On-Behalf-Of. See [MCP OBO Auth](./mcp_obo_auth.md) | + | `aws_sigv4` | Per-request AWS SigV4 signature. See [MCP AWS SigV4](./mcp_aws_sigv4.md) | - **Extra Headers**: Optional list of additional header names that should be forwarded from client to the MCP server - **Static Headers**: Optional map of header key/value pairs to include every request to the MCP server. diff --git a/docs/mcp_control.md b/docs/mcp_control.md index ccaa37f94..0bceda4ab 100644 --- a/docs/mcp_control.md +++ b/docs/mcp_control.md @@ -35,6 +35,50 @@ When Creating a Key, Team, or Organization, you can select the allowed MCP Serve style={{width: '80%', display: 'block', margin: '0'}} /> +## Permission Hierarchy + +Permissions can be set at five distinct levels. 
When more than one level applies to a request, LiteLLM **intersects** the lists (most-restrictive wins) — except for the organization level, which acts as a **ceiling**. + +| Level | Source | How it composes | +|---|---|---| +| **Key** | `object_permission.mcp_servers` / `object_permission.mcp_access_groups` on the virtual key | If the key has an explicit list, it's used. | +| **Team** | Same fields on the team | If both key and team have lists, the result is the **intersection** (only servers in both). If only the team has a list, the key inherits it. | +| **End user** | Same fields on the `LiteLLM_EndUserTable` row matching `x-litellm-end-user-id` | Intersected with the running result. Skipped if no end-user-id is present on the request. | +| **Agent** | Same fields on the agent identified by `x-litellm-agent-id` | Intersected with the running result. Skipped if no agent-id is present. | +| **Organization** | Same fields on the org owning the key/team | Acts as a **ceiling** — the final allowed-server set is intersected with the org's list. If the org has no list, no additional restriction. | + +If no level has a list, the request can access **every** MCP server (open by default). 
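The composition rules in the table above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not LiteLLM's implementation: `resolve_allowed_servers` and its parameters are hypothetical names, `None` stands for "this level has no list", and how LiteLLM treats an empty list is not covered here.

```python
from typing import Iterable, Optional, Set


def resolve_allowed_servers(
    all_servers: Iterable[str],
    key: Optional[Iterable[str]] = None,
    team: Optional[Iterable[str]] = None,
    end_user: Optional[Iterable[str]] = None,
    agent: Optional[Iterable[str]] = None,
    org: Optional[Iterable[str]] = None,
) -> Set[str]:
    """Hypothetical helper: intersect every level that defines a list."""
    # Open by default: with no lists at all, every registered server is allowed.
    allowed = set(all_servers)
    # Key, team, end user, and agent narrow the running set; the org list is
    # applied last as the ceiling. Pass None for end_user / agent when the
    # matching header is absent from the request.
    for level in (key, team, end_user, agent, org):
        if level is not None:
            allowed &= set(level)
    return allowed
```

For example, a key limited to `github` and `jira` inside a team limited to `github` and `deepwiki` ends up with `github` only, and an org list can only shrink that result further.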
+ +```mermaid +flowchart TD + A[Inbound MCP request] --> B{Key has mcp_servers list?} + B -->|Yes| C[Start with key's list] + B -->|No| D[Start with: all servers] + C --> E{Team has list?} + D --> E + E -->|Yes, key also had list| F[Intersect with team's list] + E -->|Yes, key had no list| G[Use team's list] + E -->|No| H[Keep current] + F --> I + G --> I + H --> I + I[Running set] --> J{end-user-id present and end-user has list?} + J -->|Yes| K[Intersect with end-user list] + J -->|No| L[Keep current] + K --> M + L --> M + M{agent-id present and agent has list?} + M -->|Yes| N[Intersect with agent list] + M -->|No| O[Keep current] + N --> P + O --> P + P{Org has list?} + P -->|Yes| Q[Cap final set to org's list] + P -->|No| R[Final set] + Q --> R +``` + +The same intersection model applies to the per-server tool-level dict `mcp_tool_permissions` (see [Per-entity Tool-Level Permissions](#per-entity-tool-level-permissions) below). ## Allow/Disallow MCP Tools @@ -624,17 +668,109 @@ When creating API keys, you can assign them to specific access groups for permis style={{width: '80%', display: 'block', margin: '0'}} /> +#### Per-request Access Group Scoping — `x-mcp-access-groups` Header +In addition to the `x-mcp-servers` header (which targets servers by name), clients can scope a request to one or more **access groups** using the `x-mcp-access-groups` header. LiteLLM resolves the group names to concrete server IDs and intersects with the caller's normal permissions — the header narrows the scope, it does not grant access. -## Set Allowed Tools for a Key, Team, or Organization +```bash title="Scope this request to two access groups" showLineNumbers +curl -X POST "/mcp" \ + -H "x-litellm-api-key: Bearer sk-..." \ + -H "x-mcp-access-groups: dev_group,research_tools" \ + -H "Content-Type: application/json" \ + -d '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' +``` + +Comma-separated list. 
Combine with `x-mcp-servers` to further narrow the set; the two headers are intersected before the per-entity intersection model runs. -Control which tools different teams can access from the same MCP server. For example, give your Engineering team access to `list_repositories`, `create_issue`, and `search_code`, while Sales only gets `search_code` and `close_issue`. +## Per-entity Tool-Level Permissions {#per-entity-tool-level-permissions} + +Control which tools different teams can access from the same MCP server. For example, give your Engineering team access to `list_repositories`, `create_issue`, and `search_code`, while Sales only gets `search_code` and `close_issue`. + This video shows how to set allowed tools for a Key, Team, or Organization. +### `mcp_tool_permissions` API + +`object_permission.mcp_tool_permissions` is a `Dict[server_id, List[tool_name]]` on the key, team, end-user, agent, or organization. It's evaluated **after** server-level access has been resolved (see [Permission Hierarchy](#permission-hierarchy) above) and applies the same five-level intersection — most-restrictive wins, organization acts as a ceiling. + +This is distinct from the server-registration-level `allowed_tools` / `disallowed_tools` (which apply to **every** caller of the server). `mcp_tool_permissions` lets you carve out per-team subsets without changing the server config. 
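As a rough sketch of how the tool-level intersection could behave for one server, the snippet below treats each level's `mcp_tool_permissions` as a plain dict. The function name `resolve_allowed_tools` is hypothetical, and the rule that a level without an entry for the server adds no restriction is an assumption, not something this page confirms.

```python
from typing import Dict, List, Optional, Set

# Same shape as the documented field: server_id -> list of tool names.
ToolPermissions = Dict[str, List[str]]


def resolve_allowed_tools(
    server_id: str,
    server_tools: List[str],
    levels: List[Optional[ToolPermissions]],
) -> Set[str]:
    """Hypothetical helper: intersect per-entity tool lists for one server."""
    allowed = set(server_tools)
    for perms in levels:
        # Assumption: a level with no dict, or no entry for this server,
        # does not restrict the tool set any further.
        if perms is None or server_id not in perms:
            continue
        allowed &= set(perms[server_id])
    return allowed
```

With a team entry of `list_repositories`, `create_issue`, `search_code` and a key entry of `search_code`, `close_issue`, the only tool left for that key is `search_code`.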
+ + + + +```bash title="Engineering key: full GitHub access" showLineNumbers +curl -X POST "http://localhost:4000/key/generate" \ + -H "Authorization: Bearer sk-master-key" \ + -H "Content-Type: application/json" \ + -d '{ + "object_permission": { + "mcp_servers": ["github_mcp"], + "mcp_tool_permissions": { + "github_mcp": ["list_repositories", "create_issue", "search_code"] + } + } + }' +``` + +```bash title="Sales key: restricted tool subset on the same server" showLineNumbers +curl -X POST "http://localhost:4000/key/generate" \ + -H "Authorization: Bearer sk-master-key" \ + -H "Content-Type: application/json" \ + -d '{ + "object_permission": { + "mcp_servers": ["github_mcp"], + "mcp_tool_permissions": { + "github_mcp": ["search_code", "close_issue"] + } + } + }' +``` + + + + +```bash title="Team-wide tool subset (all keys inherit)" showLineNumbers +curl -X POST "http://localhost:4000/team/new" \ + -H "Authorization: Bearer sk-master-key" \ + -H "Content-Type: application/json" \ + -d '{ + "team_alias": "engineering", + "object_permission": { + "mcp_servers": ["github_mcp", "deepwiki_mcp"], + "mcp_tool_permissions": { + "github_mcp": ["list_repositories", "create_issue", "search_code"] + } + } + }' +``` + +When the key also sets `mcp_tool_permissions` for `github_mcp`, the resulting tool list is the **intersection** of the two. + + + + +When an agent (identified by `x-litellm-agent-id`) calls MCP tools, the agent's own `mcp_tool_permissions` participate in the intersection. This is useful for capping what an autonomous agent can do, regardless of which key originally invoked it. 
+ +```bash showLineNumbers +curl -X PATCH "http://localhost:4000/v1/agents/{agent_id}" \ + -H "Authorization: Bearer sk-master-key" \ + -H "Content-Type: application/json" \ + -d '{ + "object_permission": { + "mcp_servers": ["github_mcp"], + "mcp_tool_permissions": { + "github_mcp": ["search_code"] + } + } + }' +``` + + + + ## Dashboard View Modes diff --git a/docs/mcp_oauth.md b/docs/mcp_oauth.md index 7980a1927..dbcca3bfe 100644 --- a/docs/mcp_oauth.md +++ b/docs/mcp_oauth.md @@ -296,11 +296,17 @@ curl http://localhost:4000/mcp-rest/tools/call \ | Field | Required | Description | |-------|----------|-------------| -| `auth_type` | Yes | Must be `oauth2` | -| `client_id` | Yes | OAuth2 client ID. Supports `os.environ/VAR_NAME` | -| `client_secret` | Yes | OAuth2 client secret. Supports `os.environ/VAR_NAME` | -| `token_url` | Yes | Token endpoint URL | -| `scopes` | No | List of scopes to request | +| `auth_type` | Yes | Must be `oauth2`. For RFC 8693 On-Behalf-Of, use `oauth2_token_exchange` instead — see [MCP OBO Auth](./mcp_obo_auth.md). | +| `oauth2_flow` | No | Explicit flow selector. One of `"client_credentials"` (M2M) or `"authorization_code"` (interactive PKCE). If omitted, LiteLLM infers from the other fields: `authorization_url` present → interactive; only `token_url` + `client_id` + `client_secret` → client credentials. Set explicitly when in doubt — for example, when a legacy DB row has both `authorization_url` and `token_url` but you want M2M. | +| `client_id` | Yes for M2M, optional for interactive | OAuth2 client ID. Required for `client_credentials`. For interactive flows, can be obtained via Dynamic Client Registration (RFC 7591) at `POST /{server_name}/register` if the upstream supports it. Supports `os.environ/VAR_NAME`. | +| `client_secret` | Yes for M2M, optional for interactive | OAuth2 client secret. Same applicability as `client_id`. Supports `os.environ/VAR_NAME`. 
| +| `token_url` | Yes for M2M, optional for interactive | Token endpoint URL. LiteLLM POSTs to this for `client_credentials` and for the authorization-code exchange. | +| `authorization_url` | Interactive only | Upstream authorization endpoint. When present, LiteLLM treats the server as interactive PKCE and proxies `GET /{server_name}/authorize` to this URL. | +| `registration_url` | Optional | Upstream Dynamic Client Registration endpoint (RFC 7591). When present, `POST /{server_name}/register` proxies through to this URL. | +| `scopes` | No | List of scopes to request. For M2M, joined into the `scope` parameter on the token request. For interactive, forwarded on the authorize request. | +| `token_validation` | No | Dict of key-value rules checked against the OAuth token response after the `/token` exchange. Any rule mismatch fails the exchange with `token_validation_failed`. Useful for asserting a tenant claim like `{"team.enterprise_id": "T12345"}`. | +| `token_storage_ttl_seconds` | No | Override the TTL for the per-user token cache (interactive flow). If unset, LiteLLM uses `expires_in - buffer` from the token response. | +| `delegate_auth_to_upstream` | No | When `true`, skip LiteLLM's own API-key / SSO check and let the client's PKCE flow run end-to-end with the upstream MCP server. See [Delegate Auth to Upstream](#delegate-auth-to-upstream-pkce-passthrough) above. | ## Debugging OAuth diff --git a/docs/mcp_public_internet.md b/docs/mcp_public_internet.md index 69dd74646..ffc9145c6 100644 --- a/docs/mcp_public_internet.md +++ b/docs/mcp_public_internet.md @@ -249,3 +249,30 @@ general_settings: ``` When empty, the standard private ranges are used (`10.0.0.0/8`, `172.16.0.0/12`, `192.168.0.0/16`, `127.0.0.0/8`). 
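To make the fallback to the standard private ranges concrete, here is a small sketch using Python's standard `ipaddress` module. The function name `is_internal_caller` and the `configured_ranges` parameter are hypothetical stand-ins for the setting shown above. The sketch only illustrates the CIDR check, not how LiteLLM determines the client IP.

```python
import ipaddress
from typing import Iterable, Optional

# The standard private ranges listed above, used when nothing is configured.
DEFAULT_PRIVATE_RANGES = (
    "10.0.0.0/8",
    "172.16.0.0/12",
    "192.168.0.0/16",
    "127.0.0.0/8",
)


def is_internal_caller(
    client_ip: str, configured_ranges: Optional[Iterable[str]] = None
) -> bool:
    """Hypothetical helper: True when client_ip falls inside a trusted range."""
    # An empty or missing configuration falls back to the defaults.
    ranges = list(configured_ranges or []) or list(DEFAULT_PRIVATE_RANGES)
    ip = ipaddress.ip_address(client_ip)
    return any(ip in ipaddress.ip_network(cidr) for cidr in ranges)
```

A caller for which this returns `False` would be treated as external, so only servers with `available_on_public_internet: true` would be visible to it.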
+ +--- + +## Public Internet vs MCP Hub Visibility + +`available_on_public_internet` and the **MCP Hub strict whitelist** are two separate mechanisms that are easy to confuse: + +| Concern | Controlled by | Default | +|---|---|---| +| Can an external (non-private-CIDR) caller see this server? | `available_on_public_internet` on the server | `false` (internal only) | +| Does this server appear in the published MCP Hub registry (`/v1/mcp/registry.json` when `enable_mcp_registry: true`)? | `litellm.public_mcp_servers` list, gated by `litellm.public_mcp_hub_strict_whitelist` | Hub strict whitelist is **on** by default — only servers explicitly listed in `public_mcp_servers` are advertised | + +In the **default strict-whitelist mode**, `available_on_public_internet: true` does not make a server appear in the hub. You also need to add it to `public_mcp_servers`: + +```yaml title="Both flags set — visible to external callers AND on the hub" showLineNumbers +litellm_settings: + public_mcp_servers: + - deepwiki + # public_mcp_hub_strict_whitelist defaults to true + +mcp_servers: + deepwiki: + url: https://mcp.deepwiki.com/mcp + available_on_public_internet: true +``` + +If you set `public_mcp_hub_strict_whitelist: false`, the hub falls back to advertising every server that has `available_on_public_internet: true` — but the IP-based access filter on this page still applies independently. 
From 61432fb15ae9dabe205f94cb8492452a3c9bd6d1 Mon Sep 17 00:00:00 2001 From: Gabriele Michelli Date: Mon, 18 May 2026 19:16:07 +0200 Subject: [PATCH 2/5] docs(a2a): document x-litellm-api-key, trace-id enforcement, sub-agent propagation, agent access groups and full intersection model --- docs/a2a.md | 36 +++++++++++++- docs/a2a_agent_permissions.md | 88 ++++++++++++++++++++++++++++------- 2 files changed, 106 insertions(+), 18 deletions(-) diff --git a/docs/a2a.md b/docs/a2a.md index 9c86d0de3..bd29f5629 100644 --- a/docs/a2a.md +++ b/docs/a2a.md @@ -211,12 +211,46 @@ POST /a2a/{agent_name}/message/send ### Authentication -Include your LiteLLM Virtual Key in the `Authorization` header: +Include your LiteLLM Virtual Key in either of two headers — `x-litellm-api-key` is preferred when the inbound `Authorization` header may carry a token destined for the backend agent (e.g. when using the [convention-based passthrough](./a2a_agent_headers#method-3--convention-based-forwarding) to forward the caller's identity). ``` Authorization: Bearer sk-your-litellm-key +# or +x-litellm-api-key: Bearer sk-your-litellm-key ``` +#### Per-agent permission check + +After the virtual key is authenticated, LiteLLM checks whether the calling key (and its team) is allowed to invoke the requested agent. If not, the response is HTTP 403. See [Agent Permission Management](./a2a_agent_permissions) for the full intersection model and access groups. + +#### Trace ID enforcement (optional, per-agent) + +An agent can require every inbound request to carry a trace ID for cross-system audit threading. Set `require_trace_id_on_calls_to_agent: true` in the agent's `litellm_params`. When set, requests missing `x-litellm-trace-id` (or `x-litellm-session-id`) are rejected with HTTP 400. 
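The rejection rule can be pictured as a small header check. This is a sketch only: `check_trace_id` is a hypothetical helper, and it returns a status code and the resolved ID rather than raising the way a real request handler might.

```python
from typing import Mapping, Optional, Tuple

# Checked in priority order: the trace ID header first, then the session ID.
TRACE_HEADERS = ("x-litellm-trace-id", "x-litellm-session-id")


def check_trace_id(
    headers: Mapping[str, str], require_trace_id: bool
) -> Tuple[int, Optional[str]]:
    """Hypothetical helper: return (status_code, trace_id) for an inbound call."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    for name in TRACE_HEADERS:
        value = lowered.get(name, "").strip()
        if value:
            return 200, value
    # No usable ID: reject only when the agent requires one.
    if require_trace_id:
        return 400, None
    return 200, None
```

Here `require_trace_id` plays the role of `require_trace_id_on_calls_to_agent` on the target agent.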
+ +```bash title="Register an agent that requires inbound trace IDs" showLineNumbers +curl -X POST http://localhost:4000/v1/agents \ + -H "Authorization: Bearer sk-master-key" \ + -H "Content-Type: application/json" \ + -d '{ + "agent_name": "audit-critical-agent", + "agent_card_params": { ... }, + "litellm_params": { + "require_trace_id_on_calls_to_agent": true + } + }' +``` + +The reverse direction — enforcing trace ID on **outbound** calls made by a key owned by an agent — is controlled by `require_trace_id_on_calls_by_agent` on the same `litellm_params` block. + +#### Sub-agent identity propagation + +When the backend agent itself calls LiteLLM (for chat completions or to invoke a sub-agent), LiteLLM forwards two headers to maintain trace continuity: + +- `X-LiteLLM-Trace-Id` — links all calls in the chain to a single trace +- `X-LiteLLM-Agent-Id` — attributes spend to the originating agent + +The caller's **virtual key** and **end-user ID** are not automatically forwarded. If the downstream agent needs the user's identity, propagate it explicitly via [`extra_headers` or the `x-a2a-{agent_name_or_id}-{header}` convention](./a2a_agent_headers). + ### Request Format LiteLLM follows the [A2A JSON-RPC 2.0 specification](https://github.com/google/A2A): diff --git a/docs/a2a_agent_permissions.md b/docs/a2a_agent_permissions.md index 93f367f43..70c527dac 100644 --- a/docs/a2a_agent_permissions.md +++ b/docs/a2a_agent_permissions.md @@ -208,35 +208,89 @@ curl -X POST "http://localhost:4000/a2a/agent-456" \ -d '{"message": {"role": "user", "parts": [{"type": "text", "text": "Hello"}]}}' ``` +## Agent Access Groups + +Granting individual agents to every key or team gets unwieldy as the agent catalog grows. **Agent access groups** let you tag agents with logical labels, then grant the **group** to a key or team — adding a new agent to the group automatically makes it available to every key/team that holds the group. + +### 1. 
Tag the agent with one or more groups + + + + +1. Go to **Agents** in the LiteLLM dashboard. +2. Create or edit an agent. +3. Under **Access Groups**, type a group name (e.g. `clinical-tools`) and press Enter. + + + + +```bash showLineNumbers +curl -X POST http://localhost:4000/v1/agents \ + -H "Authorization: Bearer sk-master-key" \ + -H "Content-Type: application/json" \ + -d '{ + "agent_name": "patient-lookup", + "agent_card_params": { ... }, + "agent_access_groups": ["clinical-tools", "phi-allowed"] + }' +``` + + + + +### 2. Grant a key or team the group + +```bash title="Key with access to two groups" showLineNumbers +curl -X POST "http://localhost:4000/key/generate" \ + -H "Authorization: Bearer sk-master-key" \ + -H "Content-Type: application/json" \ + -d '{ + "object_permission": { + "agent_access_groups": ["clinical-tools", "research-tools"] + } + }' +``` + +The key now has access to every agent tagged with either group — no per-agent enumeration required. The same `agent_access_groups` field is also valid on a team's `object_permission`. + +When a key has **both** a direct `agents` list and `agent_access_groups`, the union is computed (any agent reached by either path is allowed), and then the team-level intersection is applied as described below. 
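The union-then-intersect behavior described above can be sketched as follows. All names here (`expand_allowlist`, `resolve_allowed_agents`, `agent_tags`) are hypothetical, and `None` is used to mean "this level sets neither `agents` nor `agent_access_groups`".

```python
from typing import Dict, Iterable, List, Optional, Set


def expand_allowlist(
    agents: Optional[Iterable[str]],
    groups: Optional[Iterable[str]],
    agent_tags: Dict[str, List[str]],
) -> Optional[Set[str]]:
    """Union of direct grants and every agent tagged with a granted group."""
    if agents is None and groups is None:
        return None  # this level imposes no restriction
    wanted = set(groups or [])
    allowed = set(agents or [])
    allowed |= {name for name, tags in agent_tags.items() if wanted & set(tags)}
    return allowed


def resolve_allowed_agents(
    agent_tags: Dict[str, List[str]],
    key: Optional[Set[str]] = None,
    team: Optional[Set[str]] = None,
) -> Set[str]:
    """Intersect the expanded key and team allowlists (open when both are None)."""
    allowed = set(agent_tags)
    for level in (key, team):
        if level is not None:
            allowed &= level
    return allowed
```

Because groups are expanded at evaluation time in this sketch, tagging a new agent with `clinical-tools` makes it reachable for every holder of that group without touching any key or team.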
+ ## How It Works ```mermaid flowchart TD - A[Request to invoke agent] --> B{LiteLLM Virtual Key has agent restrictions?} - B -->|Yes| C{LiteLLM Team has agent restrictions?} - B -->|No| D{LiteLLM Team has agent restrictions?} - - C -->|Yes| E[Use intersection of key + team permissions] - C -->|No| F[Use key permissions only] - - D -->|Yes| G[Inherit team permissions] - D -->|No| H[Allow ALL agents] - - E --> I{Agent in allowed list?} - F --> I - G --> I - H --> J[Allow request] - - I -->|Yes| J - I -->|No| K[Return 403 Forbidden] + A[Request to invoke agent] --> B{Key allowlist: agents + agent_access_groups} + B --> C{Team allowlist: agents + agent_access_groups} + C -->|Both empty| D[Allow ALL agents] + C -->|Key only| E[Use key's allowlist] + C -->|Team only| F[Inherit team's allowlist] + C -->|Both set| G[Intersect key and team allowlists] + D --> H{End-user has allowlist?} + E --> H + F --> H + G --> H + H -->|Yes| I[Intersect with end-user allowlist] + H -->|No| J[Keep current] + I --> K{Org has allowlist?} + J --> K + K -->|Yes| L[Cap final set to org's allowlist - org is a ceiling] + K -->|No| M[Final allowlist] + L --> M + M --> N{Requested agent in final list?} + N -->|Yes| O[Allow request] + N -->|No| P[Return 403 Forbidden] ``` +The model mirrors the [MCP RBAC intersection](./mcp_control#permission-hierarchy): per-level lists are intersected (most-restrictive wins) except for the organization level, which acts as a **ceiling**. 
+ | Key Permissions | Team Permissions | Result | Notes | |-----------------|------------------|--------|-------| | None | None | Key can access **all** agents | Open access by default when no restrictions are set | | `["agent-1", "agent-2"]` | None | Key can access `agent-1` and `agent-2` | Key uses its own permissions | | None | `["agent-1", "agent-3"]` | Key can access `agent-1` and `agent-3` | Key inherits team's permissions | | `["agent-1", "agent-2"]` | `["agent-1", "agent-3"]` | Key can access `agent-1` only | Intersection of both lists (most restrictive wins) | +| `agent_access_groups: ["clinical"]` | None | Key can access every agent tagged `clinical` | Access groups resolved to concrete agent IDs | +| `agent_access_groups: ["clinical"]` | `agents: ["agent-1"]` | Intersection of (every agent tagged `clinical`) and `["agent-1"]` | Mixing direct and group grants is supported | ## Viewing Permissions From 883d44a687dbde87e991e2b185861636c8f6fc1f Mon Sep 17 00:00:00 2001 From: Gabriele Michelli Date: Mon, 18 May 2026 19:17:16 +0200 Subject: [PATCH 3/5] =?UTF-8?q?docs(bedrock=5Fagentcore):=20add=20LiteLLM?= =?UTF-8?q?=20A2A=20Gateway=20section=20=E2=80=94=20fixes=20broken=20ancho?= =?UTF-8?q?r=20from=20a2a.md,=20documents=20dual=20JWT/SigV4=20auth=20mode?= =?UTF-8?q?s=20and=20full=20credential=20chain?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- docs/providers/bedrock_agentcore.md | 146 ++++++++++++++++++++++++++++ 1 file changed, 146 insertions(+) diff --git a/docs/providers/bedrock_agentcore.md b/docs/providers/bedrock_agentcore.md index 7802624fc..96a8265a6 100644 --- a/docs/providers/bedrock_agentcore.md +++ b/docs/providers/bedrock_agentcore.md @@ -245,8 +245,154 @@ model_list: | `qualifier` | string | Optional runtime qualifier/version to invoke a specific version of the agent runtime | | `runtimeSessionId` | string | Optional custom session ID (must be 33+ characters). 
If not provided, LiteLLM generates one automatically | +## LiteLLM A2A Gateway {#litellm-a2a-gateway} + +Register a Bedrock AgentCore runtime as a first-class A2A agent on the LiteLLM [Agent Gateway](../a2a). This gives you per-agent RBAC, access groups, trace-ID enforcement, and the `x-a2a-{agent_name_or_id}-{header}` per-user passthrough convention — same surface as any other A2A provider. + +This path is distinct from the chat-completions invocation above. Pick one based on your client: + +| You want to call AgentCore via... | Use this path | +|---|---| +| `/v1/chat/completions` with `model: bedrock/agentcore/` | Chat completions (covered above) | +| `POST /a2a/{agent_id}` with A2A JSON-RPC 2.0 (`message/send` or `message/stream`) | A2A Gateway (this section) | + +### 1. Register the agent + + + + +1. Go to **Agents** → **Add Agent**. +2. Select **Bedrock AgentCore** as the provider. +3. Paste the AgentCore Runtime ARN as the agent URL. +4. Configure AWS credentials (or leave blank to use the proxy's ambient credential chain — see [Authentication](#a2a-gateway-authentication) below). + + + + +```bash showLineNumbers +curl -X POST http://localhost:4000/v1/agents \ + -H "Authorization: Bearer sk-admin" \ + -H "Content-Type: application/json" \ + -d '{ + "agent_name": "my-agentcore-runtime", + "agent_card_params": { + "name": "my-agentcore-runtime", + "description": "Internal research agent", + "url": "bedrock/agentcore/arn:aws:bedrock-agentcore:us-east-1:123456789012:runtime/my-runtime" + }, + "litellm_params": { + "custom_llm_provider": "bedrock", + "aws_role_name": "arn:aws:iam::123456789012:role/LiteLLMAgentCoreInvoker", + "aws_region_name": "us-east-1" + } + }' +``` + + + + +### 2. 
Invoke via A2A + +```bash showLineNumbers +curl -X POST http://localhost:4000/a2a/my-agentcore-runtime/message/send \ + -H "x-litellm-api-key: Bearer sk-client-key" \ + -H "Content-Type: application/json" \ + -d '{ + "jsonrpc": "2.0", + "id": "1", + "method": "message/send", + "params": { + "message": { + "role": "user", + "parts": [{"kind": "text", "text": "Summarize the latest clinical trial results"}], + "messageId": "msg-1" + } + } + }' +``` + +### Authentication {#a2a-gateway-authentication} + +The AgentCore A2A path supports **two distinct outbound auth modes**, picked automatically based on what's in `litellm_params`: + +| Mode | When it fires | What's sent to AgentCore | +|---|---|---| +| **Bearer / JWT** | `litellm_params.api_key` is set (any value) | `Authorization: Bearer ` — SigV4 is bypassed entirely | +| **SigV4** | `litellm_params.api_key` is **not** set | Per-request SigV4 signature using the full AWS credential chain (below) | + +#### SigV4 credential resolution + +When SigV4 mode is active, credentials are resolved in this priority order: + +1. **`aws_web_identity_token` + `aws_role_name` + `aws_session_name`** → `sts:AssumeRoleWithWebIdentity`. Cross-account IRSA path. +2. **`aws_role_name` alone** → `sts:AssumeRole`. The proxy's ambient credentials (instance profile, IRSA, env vars) are the source identity. Session name auto-generated if omitted. +3. **`aws_profile_name`** → resolved via the boto3 profile loader (`~/.aws/credentials`). +4. **`aws_access_key_id` + `aws_secret_access_key` + `aws_session_token`** → explicit temporary credentials. +5. **`aws_access_key_id` + `aws_secret_access_key`** → explicit long-lived credentials. +6. **No credentials configured** → boto3 default chain (env vars, IRSA via `AWS_WEB_IDENTITY_TOKEN_FILE` + `AWS_ROLE_ARN`, instance metadata). 
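The mode selection and the priority order above can be read as a simple decision function. The sketch below only classifies a `litellm_params` dict and makes no AWS calls. The function name and the returned labels are hypothetical, and treating any non-`None` `api_key` as "set" is an assumption.

```python
from typing import Any, Mapping


def pick_outbound_auth(litellm_params: Mapping[str, Any]) -> str:
    """Hypothetical helper: label the auth path that the list above would select."""

    def has(*names: str) -> bool:
        return all(litellm_params.get(name) for name in names)

    # Bearer / JWT mode wins whenever api_key is set; SigV4 is skipped.
    if litellm_params.get("api_key") is not None:
        return "bearer"
    # SigV4 mode: walk the documented priority order from top to bottom.
    if has("aws_web_identity_token", "aws_role_name", "aws_session_name"):
        return "sigv4:assume_role_with_web_identity"
    if has("aws_role_name"):
        return "sigv4:assume_role"
    if has("aws_profile_name"):
        return "sigv4:profile"
    if has("aws_access_key_id", "aws_secret_access_key", "aws_session_token"):
        return "sigv4:temporary_credentials"
    if has("aws_access_key_id", "aws_secret_access_key"):
        return "sigv4:long_lived_credentials"
    return "sigv4:default_chain"
```

Note the ordering consequence: a role ARN takes precedence over explicit keys, so setting both `aws_role_name` and `aws_access_key_id` lands on the AssumeRole path in this sketch.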
+ +Recognized fields on `litellm_params` for SigV4: + +| Field | Description | +|---|---| +| `aws_role_name` | IAM role ARN to assume via STS | +| `aws_session_name` | Session name for the AssumeRole call (auto-generated if omitted) | +| `aws_external_id` | ExternalId passed to `sts:AssumeRole` for cross-account trust policies | +| `aws_web_identity_token` | OIDC token for `AssumeRoleWithWebIdentity` (set explicitly or via `AWS_WEB_IDENTITY_TOKEN_FILE` env) | +| `aws_profile_name` | AWS CLI profile name | +| `aws_sts_endpoint` | Custom STS endpoint (VPC endpoints, FIPS endpoints) | +| `aws_access_key_id` / `aws_secret_access_key` / `aws_session_token` | Explicit credentials | +| `aws_region_name` | AWS region. If omitted, detected from the runtime ARN in `agent_card_params.url`. | + +#### IRSA on EKS + +For Kubernetes deployments using [IAM Roles for Service Accounts](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html), no explicit credential configuration is needed — boto3's default chain picks up `AWS_WEB_IDENTITY_TOKEN_FILE` and `AWS_ROLE_ARN` from the pod environment automatically. + +If you want the invocation to assume a **second** role (e.g. 
separate the pod's identity from the agent-invocation identity for CloudTrail attribution), combine IRSA with `aws_role_name`: + +```bash showLineNumbers +curl -X POST http://localhost:4000/v1/agents \ + -H "Authorization: Bearer sk-admin" \ + -H "Content-Type: application/json" \ + -d '{ + "agent_name": "production-runtime", + "agent_card_params": { + "name": "production-runtime", + "url": "bedrock/agentcore/arn:aws:bedrock-agentcore:us-east-1:123456789012:runtime/prod" + }, + "litellm_params": { + "custom_llm_provider": "bedrock", + "aws_role_name": "arn:aws:iam::123456789012:role/AgentCoreInvocationRole", + "aws_session_name": "litellm-prod" + } + }' +``` + +The proxy pod's IRSA role serves as the source identity for the AssumeRole call; the assumed role's CloudTrail entries reflect the agent invocation. + +### Per-user header passthrough + +The standard A2A header forwarding mechanisms apply — see [A2A Agent Authentication Headers](../a2a_agent_headers) for the full reference. All three methods work with AgentCore: + +- **`static_headers`** — always sent to AgentCore (e.g. a custom `X-Tenant-Id`) +- **`extra_headers`** — admin-configured allowlist of client headers to forward +- **`x-a2a-{agent_name_or_id}-{header}` convention** — caller-driven forwarding without admin config + +Note that the SigV4 / Bearer auth handled by `litellm_params` is **separate** from the agent-level header forwarding above. Auth headers are computed per-request by the AWS signer; user passthrough headers are merged into the request after signing. + +### RBAC and trace IDs + +All standard A2A controls apply: +- **Per-agent RBAC** — [Agent Permission Management](../a2a_agent_permissions). Returns HTTP 403 when the calling key/team isn't authorized for the AgentCore agent. +- **Access groups** — tag the agent with `agent_access_groups: ["clinical-tools"]` and grant the group to a team. 
+- **Trace ID enforcement** — set `require_trace_id_on_calls_to_agent: true` on `litellm_params` to require `x-litellm-trace-id` on every inbound call. See [A2A Overview — Trace ID enforcement](../a2a#trace-id-enforcement-optional-per-agent). + ## Further Reading - [AWS Bedrock AgentCore Documentation](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agentcore_InvokeAgentRuntime.html) - [LiteLLM Authentication to Bedrock](https://docs.litellm.ai/docs/providers/bedrock#boto3---authentication) +- [LiteLLM A2A Gateway Overview](../a2a) +- [A2A Agent Authentication Headers](../a2a_agent_headers) +- [A2A Agent Permission Management](../a2a_agent_permissions) +- [MCP AWS SigV4](../mcp_aws_sigv4) — for the AgentCore-hosted MCP servers path (separate from the agent runtimes path) From 28c6463c33c39e5b459e7ec7fe8f271869a56da8 Mon Sep 17 00:00:00 2001 From: Gabriele Michelli Date: Mon, 18 May 2026 19:19:13 +0200 Subject: [PATCH 4/5] docs: add AuthN/AuthZ overview page side-by-siding MCP and A2A gateways --- docs/auth_overview.md | 163 ++++++++++++++++++++++++++++++++++++++++++ sidebars.js | 1 + 2 files changed, 164 insertions(+) create mode 100644 docs/auth_overview.md diff --git a/docs/auth_overview.md b/docs/auth_overview.md new file mode 100644 index 000000000..d8daeea36 --- /dev/null +++ b/docs/auth_overview.md @@ -0,0 +1,163 @@ +import Tabs from '@theme/Tabs'; +import TabItem from '@theme/TabItem'; + +# AuthN/AuthZ Reference — MCP and A2A Side-by-Side + +LiteLLM exposes two gateway surfaces that share most authentication and authorization primitives but diverge in a few important places. This page is the side-by-side reference: which header does what, where the two surfaces are symmetric, and where they're not. Each section links out to the dedicated page for the deep dive. 
+ +| Surface | Endpoints | Dedicated docs | +|---|---|---| +| **MCP Gateway** | `/mcp`, `/{server}/mcp`, `/toolset/{name}/mcp`, `/sse`, `/v1/mcp/...`, `/mcp-rest/...` | [MCP Overview](./mcp) | +| **A2A Agent Gateway** | `/a2a/{agent_id}`, `/a2a/{agent_id}/message/send`, `/v1/agents/...` | [A2A Overview](./a2a) | + +--- + +## 1. Client → LiteLLM (authenticating the caller) + +Both surfaces accept the same LiteLLM Virtual Key headers and the same identification headers. The one place they diverge: the MCP **ASGI** routes (the streamable MCP endpoints at `/mcp`, `/{name}/mcp`, `/toolset/{name}/mcp`, `/sse`) bypass the standard FastAPI auth dependency and only check `x-litellm-api-key` and `Authorization`. The MCP **REST/management** routes (`/v1/mcp/...`, `/mcp-rest/...`) and **all** A2A routes accept the full six-header set. + +| Header | Purpose | MCP ASGI | MCP REST + A2A | +|---|---|---|---| +| `x-litellm-api-key: Bearer sk-...` | Preferred LiteLLM Virtual Key header. Use whenever the inbound `Authorization` header may carry a different token (OAuth passthrough, OBO, A2A per-user forwarding). | ✓ | ✓ | +| `Authorization: Bearer sk-...` | Standard fallback. Stripped of the `Bearer ` prefix before lookup. | ✓ | ✓ | +| `API-Key`, `x-api-key`, `x-goog-api-key`, `Ocp-Apim-Subscription-Key` | Vendor-specific aliases (Azure, Anthropic, Google AI Studio, Azure APIM). | — | ✓ | +| `x-litellm-end-user-id` | End-user identification. Layers per-end-user budgets, MCP access intersection, and audit log entries on top of the key. `x-litellm-customer-id` is an accepted alias. | ✓ | ✓ | +| `x-litellm-trace-id` | Cross-request correlation ID. Falls back to `x-litellm-session-id` or any matching `x--session-id` header. | ✓ | ✓ | +| `x-litellm-session-id` | Session grouping. Same parse path as trace-id, lower priority. | ✓ | ✓ | +| `x-litellm-tags` | Comma-separated tags for spend-log labeling and tag-based routing. Body field `tags` takes precedence. 
| — (not parsed on MCP ASGI) | ✓ | +| `x-litellm-mcp-debug: true` | Returns masked diagnostic response headers (`x-mcp-debug-*`). See [MCP OAuth — Debugging](./mcp_oauth#debugging-oauth). | ✓ | — | +| `x-mcp-servers` | Scope a request to specific MCP servers (comma-separated). | ✓ | — | +| `x-mcp-access-groups` | Scope a request to specific MCP access groups. See [MCP Permission Management — Per-request Access Group Scoping](./mcp_control#per-request-access-group-scoping--x-mcp-access-groups-header). | ✓ | — | + +--- + +## 2. LiteLLM → Backend (authenticating the gateway to the agent or MCP server) + +This is the section where MCP and A2A diverge most. MCP has a first-class `auth_type` field on each server registration. **A2A has no `auth_type` field at all** — the outbound auth mode is inferred from what's present in `litellm_params`. + +### MCP — `auth_type` enum + +Nine values. The MCP server's outbound `Authorization` header (or per-request SigV4 signature) is determined by `auth_type`. See [MCP Overview — Add HTTP MCP Server](./mcp#add-http-mcp-server) for the full table. + +| `auth_type` | Mechanism | Dedicated docs | +|---|---|---| +| `none` | No auth header added | — | +| `api_key` / `bearer_token` / `basic` / `authorization` / `token` | Static header, sent verbatim per call | [MCP Overview](./mcp) | +| `oauth2` | PKCE (interactive) or M2M `client_credentials`. Discriminated by `oauth2_flow`. | [MCP OAuth](./mcp_oauth) | +| `oauth2_token_exchange` | RFC 8693 On-Behalf-Of (OBO) — exchange the caller's bearer token for a scoped MCP token | [MCP OBO Auth](./mcp_obo_auth) | +| `aws_sigv4` | Per-request SigV4 signature using a dedicated MCP-side credential chain | [MCP AWS SigV4](./mcp_aws_sigv4) | + +### A2A — auth mode inferred from `litellm_params` + +There is no `auth_type` field on an agent. 
The provider handler picks the auth mechanism from the contents of `litellm_params`: + +| Mode | When it fires | Sent to backend | +|---|---|---| +| **Bearer / JWT** | `litellm_params.api_key` is set | `Authorization: Bearer ` | +| **SigV4** (AgentCore only) | `litellm_params.api_key` is unset and the provider is `bedrock` | Per-request SigV4 via the full `base_aws_llm` credential chain (six entry points: web-identity+role, role alone, profile, session-token triple, key+secret, env-vars / IRSA fallback). See [Bedrock AgentCore — A2A Gateway Authentication](./providers/bedrock_agentcore#a2a-gateway-authentication). | +| **Provider-native** | `litellm_params.custom_llm_provider` matches a non-Bedrock provider (Vertex AI Agent Engine, LangGraph, Azure AI Foundry, Pydantic AI) | The provider's normal auth path | + +The dual JWT-vs-SigV4 mode is specific to AgentCore. Other A2A providers (Vertex, LangGraph, Azure Foundry) use the provider's own credential conventions — see the relevant provider page under [Providers](./providers). + +### Zero-trust add-on (MCP-only today) + +If the MCP server needs to **cryptographically verify** the request came through LiteLLM, layer the [MCP JWT Signer](./mcp_zero_trust) guardrail on top. It signs every outbound tool call with a short-lived RS256 JWT and publishes a JWKS endpoint the MCP server can verify against. This is a guardrail (`guardrail: mcp_jwt_signer`, `mode: pre_mcp_call`), not an `auth_type` — it composes with any `auth_type`. + +--- + +## 3. Per-user header passthrough + +Both surfaces let clients forward credentials destined for a specific backend server/agent without admin pre-configuration. The conventions look symmetric but parse differently — be precise when copy-pasting.
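To make the difference concrete, here is a minimal Python sketch of the two parse rules tabulated in this section. It is an illustration only, not LiteLLM's implementation: the function names `parse_mcp_passthrough` and `parse_a2a_passthrough` are hypothetical, and the sketch assumes header names are compared in lowercase and that lookup against the server's `alias` / `server_name` happens in a later step that is not shown.

```python
from typing import Optional, Tuple


def parse_mcp_passthrough(header_name: str) -> Optional[Tuple[str, str]]:
    """Hypothetical helper: split an x-mcp-* header on the FIRST dash
    after the prefix, giving (server_alias, forwarded_header_name)."""
    prefix = "x-mcp-"
    name = header_name.lower()
    if not name.startswith(prefix):
        return None
    rest = name[len(prefix):]
    alias, sep, forwarded = rest.partition("-")
    # Headers with no second segment (e.g. x-mcp-servers) are not passthrough.
    if not sep or not alias or not forwarded:
        return None
    return alias, forwarded


def parse_a2a_passthrough(
    header_name: str, agent_id: str, agent_name: str
) -> Optional[str]:
    """Hypothetical helper: exact-prefix match against the agent's ID or
    name; everything after the trailing dash is the forwarded header name."""
    name = header_name.lower()
    for ident in (agent_id, agent_name):
        prefix = f"x-a2a-{ident.lower()}-"
        if name.startswith(prefix) and len(name) > len(prefix):
            return name[len(prefix):]
    return None


# MCP: the first dash after the prefix ends the server alias.
print(parse_mcp_passthrough("x-mcp-github-authorization"))
# A2A: the known agent name is matched whole, dashes included.
print(parse_a2a_passthrough("x-a2a-my-agent-x-api-key", "agent-456", "my-agent"))
```

Under these assumptions, a dashed agent name such as `my-agent` resolves cleanly on the A2A side because the prefix is built from a known agent, whereas a first-dash split on the MCP side would read `x-mcp-my-server-authorization` as alias `my` and header `server-authorization`.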
+ +| Surface | Prefix | Parse rule | Match against | Example | +|---|---|---|---|---| +| **MCP** | `x-mcp-` | Split on the **first dash** after the prefix → `(server_alias, header_name)` | Server's `alias`, then `server_name` (case-insensitive) | `x-mcp-github-authorization: Bearer ghp_...` → server `github`, header `Authorization` | +| **A2A** | `x-a2a-` | Exact-prefix match against `x-a2a-{agent_id_lower}-` or `x-a2a-{agent_name_lower}-`; everything after the trailing dash is the header name | Agent's UUID **and** human-readable name (both tried) | `x-a2a-my-agent-x-api-key: secret` → agent `my-agent`, header `x-api-key` | + +Both surfaces also support admin-controlled alternatives that compose with the user passthrough: + +| Mechanism | MCP | A2A | Notes | +|---|---|---|---| +| `static_headers: {K: V}` | ✓ | ✓ | Always sent. **Wins over user passthrough** on key conflicts. | +| `extra_headers: [name, name, ...]` | ✓ | ✓ | Admin-allowlist of client header names to forward verbatim. | +| `x---
` convention | ✓ (`x-mcp-`) | ✓ (`x-a2a-`) | Client-driven, no admin config needed. | + +See [MCP Overview — Forwarding Custom Headers](./mcp#forwarding-custom-headers-to-mcp-servers) and [A2A Agent Authentication Headers](./a2a_agent_headers) for the full mechanics. + +--- + +## 4. AuthZ — RBAC and access groups + +Both surfaces use the same `object_permission` model with a five-level intersection. The detailed flowcharts and tables live on the dedicated pages: + +- [MCP Permission Hierarchy](./mcp_control#permission-hierarchy) +- [A2A Agent Permission Management — How It Works](./a2a_agent_permissions#how-it-works) + +| Level | MCP field | A2A field | +|---|---|---| +| **Key** | `object_permission.mcp_servers`, `object_permission.mcp_access_groups`, `object_permission.mcp_tool_permissions` | `object_permission.agents`, `object_permission.agent_access_groups` | +| **Team** | Same | Same | +| **End user** | Same (via `x-litellm-end-user-id`) | Same (via `x-litellm-end-user-id`) | +| **Agent** | Same (via `x-litellm-agent-id`) | n/a (you're already targeting an agent) | +| **Org** | Same — acts as a **ceiling** | Same — acts as a **ceiling** | + +| Concern | MCP | A2A | +|---|---|---| +| Per-server / per-agent allowlist | `object_permission.mcp_servers` | `object_permission.agents` | +| Access groups (tag-based grants) | `object_permission.mcp_access_groups` | `object_permission.agent_access_groups` | +| Per-server tool-level allowlist | `object_permission.mcp_tool_permissions: {server_id: [tool, ...]}` | n/a (tools live inside the agent) | +| Server-registration allowlist (admin-static) | `allowed_tools` / `disallowed_tools` on the MCP server | n/a | +| Param-level allowlist | `allowed_params: {tool_name: [param, ...]}` on the MCP server | n/a | +| Reject behaviour | `list_tools` filters out hidden servers; `call_tool` returns error | `GET /v1/agents` filters; `POST /a2a/{agent_id}` returns HTTP **403** | + +--- + +## 5. 
Trace IDs and identity propagation + +`x-litellm-trace-id` is **accepted** on every request and threaded through logging on both surfaces. A few A2A-specific extras: + +| Setting | Scope | Behaviour | +|---|---|---| +| `require_trace_id_on_calls_to_agent: true` | Per-agent, on the agent's `litellm_params` | Reject inbound `/a2a/{agent_id}` calls missing `x-litellm-trace-id` (or `x-litellm-session-id` fallback) with **HTTP 400**. See [A2A Overview — Trace ID enforcement](./a2a#trace-id-enforcement-optional-per-agent). | +| `require_trace_id_on_calls_by_agent: true` | Per-agent, on the agent's `litellm_params` | Reverse direction — when a key **owned by** that agent makes outbound calls, require a trace ID on those. | + +**Sub-agent identity propagation** — when LiteLLM dispatches a downstream call as part of an A2A invocation, it forwards `X-LiteLLM-Trace-Id` and `X-LiteLLM-Agent-Id` to maintain trace continuity and spend attribution. The original virtual key and end-user identity are **not** auto-forwarded. Use `extra_headers` or the `x-a2a-{agent_name_or_id}-{header}` convention to thread identity explicitly. See [A2A Overview — Sub-agent identity propagation](./a2a#sub-agent-identity-propagation). + +--- + +## 6. Guardrails on the gateway path + +| Concern | MCP | A2A | +|---|---|---| +| Pre-call input guardrails (Presidio, Bedrock, Lakera, Aporia, etc.) | `mode: pre_mcp_call` | Standard chat-completion guardrails apply to the underlying LLM calls the agent makes | +| During-call intervention | `mode: during_mcp_call` | — | +| Zero-trust JWT signing | [`mcp_jwt_signer` guardrail](./mcp_zero_trust) | — (not applicable to A2A today) | +| Documentation | [MCP Guardrails](./mcp_guardrail), [MCP Zero Trust](./mcp_zero_trust) | Standard [guardrails docs](./proxy/guardrails) apply via the agent's underlying model calls | + +--- + +## 7. 
Cheatsheet — what header does what + +For copy-paste, the high-frequency request headers across both surfaces: + +```http +# Always (LiteLLM-side auth and identification) +x-litellm-api-key: Bearer sk-... +# or +Authorization: Bearer sk-... + +x-litellm-end-user-id: user-42 +x-litellm-trace-id: 8f4a-2b1c-d3e5-... + +# MCP — server scoping / per-user passthrough +x-mcp-servers: github,zapier +x-mcp-access-groups: dev_group +x-mcp-github-authorization: Bearer ghp_ # user passthrough to github_mcp +x-litellm-mcp-debug: true # diagnostic response headers + +# A2A — per-user passthrough +x-a2a-my-agent-authorization: Bearer # caller's token to my-agent +x-a2a-my-agent-x-api-key: # additional per-agent header +``` + +For the deep dives, follow the cross-links above into the dedicated pages. diff --git a/sidebars.js b/sidebars.js index 6ff0c72fa..813b58bd8 100644 --- a/sidebars.js +++ b/sidebars.js @@ -310,6 +310,7 @@ const sidebars = { type: "category", label: "Agent & MCP Gateway", items: [ + "auth_overview", { type: "category", label: "A2A Agent Gateway", From d878975df6380a8e63e080bffa169d6c9b38af85 Mon Sep 17 00:00:00 2001 From: Gabriele Michelli Date: Mon, 18 May 2026 19:43:28 +0200 Subject: [PATCH 5/5] =?UTF-8?q?docs(fixup):=20corrections=20from=20code-re?= =?UTF-8?q?view=20pass=20=E2=80=94=20verified=20against=20current=20LiteLL?= =?UTF-8?q?M=20source?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- docs/a2a_agent_permissions.md | 67 +++++++++++------------------ docs/auth_overview.md | 12 +++--- docs/mcp.md | 4 +- docs/mcp_control.md | 15 ------- docs/mcp_oauth.md | 1 - docs/mcp_public_internet.md | 14 +++--- docs/providers/bedrock_agentcore.md | 4 +- 7 files changed, 42 insertions(+), 75 deletions(-) diff --git a/docs/a2a_agent_permissions.md b/docs/a2a_agent_permissions.md index 70c527dac..ef721955e 100644 --- a/docs/a2a_agent_permissions.md +++ b/docs/a2a_agent_permissions.md @@ -210,37 +210,23 @@ curl -X POST 
"http://localhost:4000/a2a/agent-456" \ ## Agent Access Groups -Granting individual agents to every key or team gets unwieldy as the agent catalog grows. **Agent access groups** let you tag agents with logical labels, then grant the **group** to a key or team — adding a new agent to the group automatically makes it available to every key/team that holds the group. +Granting individual agents to every key or team gets unwieldy as the agent catalog grows. **Agent access groups** let you tag agents with logical labels in the dashboard, then grant the **group** to a key or team — adding a new agent to the group automatically makes it available to every key/team that holds the group. ### 1. Tag the agent with one or more groups - - +In the LiteLLM dashboard: -1. Go to **Agents** in the LiteLLM dashboard. +1. Go to **Agents**. 2. Create or edit an agent. 3. Under **Access Groups**, type a group name (e.g. `clinical-tools`) and press Enter. - - - -```bash showLineNumbers -curl -X POST http://localhost:4000/v1/agents \ - -H "Authorization: Bearer sk-master-key" \ - -H "Content-Type: application/json" \ - -d '{ - "agent_name": "patient-lookup", - "agent_card_params": { ... }, - "agent_access_groups": ["clinical-tools", "phi-allowed"] - }' -``` - - - +:::note +Tagging an agent with access groups is currently a dashboard-only operation. The `POST /v1/agents` body schema does not expose `agent_access_groups` as a top-level field; the group tags persist via the underlying DB column and are consumed during permission resolution. +::: ### 2. 
Grant a key or team the group -```bash title="Key with access to two groups" showLineNumbers +```bash title="Key with access to two agent groups" showLineNumbers curl -X POST "http://localhost:4000/key/generate" \ -H "Authorization: Bearer sk-master-key" \ -H "Content-Type: application/json" \ @@ -259,29 +245,26 @@ When a key has **both** a direct `agents` list and `agent_access_groups`, the un ```mermaid flowchart TD - A[Request to invoke agent] --> B{Key allowlist: agents + agent_access_groups} - B --> C{Team allowlist: agents + agent_access_groups} - C -->|Both empty| D[Allow ALL agents] - C -->|Key only| E[Use key's allowlist] - C -->|Team only| F[Inherit team's allowlist] - C -->|Both set| G[Intersect key and team allowlists] - D --> H{End-user has allowlist?} - E --> H - F --> H - G --> H - H -->|Yes| I[Intersect with end-user allowlist] - H -->|No| J[Keep current] - I --> K{Org has allowlist?} - J --> K - K -->|Yes| L[Cap final set to org's allowlist - org is a ceiling] - K -->|No| M[Final allowlist] - L --> M - M --> N{Requested agent in final list?} - N -->|Yes| O[Allow request] - N -->|No| P[Return 403 Forbidden] + A[Request to invoke agent] --> B{LiteLLM Virtual Key has agent restrictions?} + B -->|Yes| C{LiteLLM Team has agent restrictions?} + B -->|No| D{LiteLLM Team has agent restrictions?} + + C -->|Yes| E[Use intersection of key + team permissions] + C -->|No| F[Use key permissions only] + + D -->|Yes| G[Inherit team permissions] + D -->|No| H[Allow ALL agents] + + E --> I{Agent in allowed list?} + F --> I + G --> I + H --> J[Allow request] + + I -->|Yes| J + I -->|No| K[Return 403 Forbidden] ``` -The model mirrors the [MCP RBAC intersection](./mcp_control#permission-hierarchy): per-level lists are intersected (most-restrictive wins) except for the organization level, which acts as a **ceiling**. +A2A permission resolution operates over two levels: Key and Team. 
(MCP's [permission hierarchy](./mcp_control#permission-hierarchy) extends to End-user / Agent / Org additionally — agent permissions are a narrower model today.) | Key Permissions | Team Permissions | Result | Notes | |-----------------|------------------|--------|-------| diff --git a/docs/auth_overview.md b/docs/auth_overview.md index d8daeea36..de13c244b 100644 --- a/docs/auth_overview.md +++ b/docs/auth_overview.md @@ -27,7 +27,6 @@ Both surfaces accept the same LiteLLM Virtual Key headers and the same identific | `x-litellm-tags` | Comma-separated tags for spend-log labeling and tag-based routing. Body field `tags` takes precedence. | — (not parsed on MCP ASGI) | ✓ | | `x-litellm-mcp-debug: true` | Returns masked diagnostic response headers (`x-mcp-debug-*`). See [MCP OAuth — Debugging](./mcp_oauth#debugging-oauth). | ✓ | — | | `x-mcp-servers` | Scope a request to specific MCP servers (comma-separated). | ✓ | — | -| `x-mcp-access-groups` | Scope a request to specific MCP access groups. See [MCP Permission Management — Per-request Access Group Scoping](./mcp_control#per-request-access-group-scoping--x-mcp-access-groups-header). | ✓ | — | --- @@ -88,7 +87,7 @@ See [MCP Overview — Forwarding Custom Headers](./mcp#forwarding-custom-headers ## 4. AuthZ — RBAC and access groups -Both surfaces use the same `object_permission` model with a five-level intersection. The detailed flowcharts and tables live on the dedicated pages: +Both surfaces use the `object_permission` model with intersection-style resolution, but at different depths today. MCP resolves across five levels; A2A across two. 
The detailed flowcharts and tables live on the dedicated pages: - [MCP Permission Hierarchy](./mcp_control#permission-hierarchy) - [A2A Agent Permission Management — How It Works](./a2a_agent_permissions#how-it-works) @@ -96,10 +95,10 @@ Both surfaces use the same `object_permission` model with a five-level intersect | Level | MCP field | A2A field | |---|---|---| | **Key** | `object_permission.mcp_servers`, `object_permission.mcp_access_groups`, `object_permission.mcp_tool_permissions` | `object_permission.agents`, `object_permission.agent_access_groups` | -| **Team** | Same | Same | -| **End user** | Same (via `x-litellm-end-user-id`) | Same (via `x-litellm-end-user-id`) | -| **Agent** | Same (via `x-litellm-agent-id`) | n/a (you're already targeting an agent) | -| **Org** | Same — acts as a **ceiling** | Same — acts as a **ceiling** | +| **Team** | Same | Same (inheritance-first: if the key has no list, it inherits the team's) | +| **End user** | Same (via `x-litellm-end-user-id`) | — not resolved today | +| **Agent** | Same (via `x-litellm-agent-id`) | — not applicable (the agent is the target) | +| **Org** | Same — acts as a **ceiling** | — not resolved today | | Concern | MCP | A2A | |---|---|---| @@ -151,7 +150,6 @@ x-litellm-trace-id: 8f4a-2b1c-d3e5-... # MCP — server scoping / per-user passthrough x-mcp-servers: github,zapier -x-mcp-access-groups: dev_group x-mcp-github-authorization: Bearer ghp_ # user passthrough to github_mcp x-litellm-mcp-debug: true # diagnostic response headers diff --git a/docs/mcp.md b/docs/mcp.md index ccda31d62..af9b0f502 100644 --- a/docs/mcp.md +++ b/docs/mcp.md @@ -226,7 +226,7 @@ mcp_servers: - **Description**: Optional description for the server - **Auth Type**: Optional authentication type. 
Supported values: - | Value | Header sent | + | Value | Header sent (managed SSE/HTTP transport) | |-------|-------------| | `none` | No auth header added | | `api_key` | `X-API-Key: ` | @@ -238,6 +238,8 @@ mcp_servers: | `oauth2_token_exchange` | `Authorization: Bearer ` — RFC 8693 On-Behalf-Of. See [MCP OBO Auth](./mcp_obo_auth.md) | | `aws_sigv4` | Per-request AWS SigV4 signature. See [MCP AWS SigV4](./mcp_aws_sigv4.md) | + Note: the header table above describes the managed SSE/HTTP transport path. The OpenAPI-tool path emits `Authorization: ApiKey ` instead of `X-API-Key` for `auth_type: api_key`; the deprecated `x-mcp-auth` broadcast header also uses the `ApiKey` form. + - **Extra Headers**: Optional list of additional header names that should be forwarded from client to the MCP server - **Static Headers**: Optional map of header key/value pairs to include every request to the MCP server. - **Spec Version**: Optional MCP specification version (defaults to `2025-06-18`) diff --git a/docs/mcp_control.md b/docs/mcp_control.md index 0bceda4ab..07c60a58d 100644 --- a/docs/mcp_control.md +++ b/docs/mcp_control.md @@ -668,21 +668,6 @@ When creating API keys, you can assign them to specific access groups for permis style={{width: '80%', display: 'block', margin: '0'}} /> -#### Per-request Access Group Scoping — `x-mcp-access-groups` Header - -In addition to the `x-mcp-servers` header (which targets servers by name), clients can scope a request to one or more **access groups** using the `x-mcp-access-groups` header. LiteLLM resolves the group names to concrete server IDs and intersects with the caller's normal permissions — the header narrows the scope, it does not grant access. - -```bash title="Scope this request to two access groups" showLineNumbers -curl -X POST "/mcp" \ - -H "x-litellm-api-key: Bearer sk-..." 
\ - -H "x-mcp-access-groups: dev_group,research_tools" \ - -H "Content-Type: application/json" \ - -d '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' -``` - -Comma-separated list. Combine with `x-mcp-servers` to further narrow the set; the two headers are intersected before the per-entity intersection model runs. - - ## Per-entity Tool-Level Permissions {#per-entity-tool-level-permissions} diff --git a/docs/mcp_oauth.md b/docs/mcp_oauth.md index dbcca3bfe..cb809cdfd 100644 --- a/docs/mcp_oauth.md +++ b/docs/mcp_oauth.md @@ -306,7 +306,6 @@ curl http://localhost:4000/mcp-rest/tools/call \ | `scopes` | No | List of scopes to request. For M2M, joined into the `scope` parameter on the token request. For interactive, forwarded on the authorize request. | | `token_validation` | No | Dict of key-value rules checked against the OAuth token response after the `/token` exchange. Any rule mismatch fails the exchange with `token_validation_failed`. Useful for asserting a tenant claim like `{"team.enterprise_id": "T12345"}`. | | `token_storage_ttl_seconds` | No | Override the TTL for the per-user token cache (interactive flow). If unset, LiteLLM uses `expires_in - buffer` from the token response. | -| `delegate_auth_to_upstream` | No | When `true`, skip LiteLLM's own API-key / SSO check and let the client's PKCE flow run end-to-end with the upstream MCP server. See [Delegate Auth to Upstream](#delegate-auth-to-upstream-pkce-passthrough) above. 
| ## Debugging OAuth diff --git a/docs/mcp_public_internet.md b/docs/mcp_public_internet.md index ffc9145c6..8e04004a8 100644 --- a/docs/mcp_public_internet.md +++ b/docs/mcp_public_internet.md @@ -254,16 +254,16 @@ When empty, the standard private ranges are used (`10.0.0.0/8`, `172.16.0.0/12`, ## Public Internet vs MCP Hub Visibility -`available_on_public_internet` and the **MCP Hub strict whitelist** are two separate mechanisms that are easy to confuse: +`available_on_public_internet` and the **MCP Hub** (`GET /public/mcp_hub`) are two separate mechanisms that are easy to confuse: | Concern | Controlled by | Default | |---|---|---| -| Can an external (non-private-CIDR) caller see this server? | `available_on_public_internet` on the server | `false` (internal only) | -| Does this server appear in the published MCP Hub registry (`/v1/mcp/registry.json` when `enable_mcp_registry: true`)? | `litellm.public_mcp_servers` list, gated by `litellm.public_mcp_hub_strict_whitelist` | Hub strict whitelist is **on** by default — only servers explicitly listed in `public_mcp_servers` are advertised | +| Can an external (non-private-CIDR) caller see this server at the MCP tool endpoints (list/call)? | `available_on_public_internet` on the server | `True` (visible by default; toggle to `false` to restrict to private CIDRs) | +| Does this server appear in the unauthenticated `GET /public/mcp_hub` advertisement? | `litellm.public_mcp_servers` list, gated by `litellm.public_mcp_hub_strict_whitelist` | Hub strict whitelist is **on** by default — only servers explicitly listed in `public_mcp_servers` are advertised | -In the **default strict-whitelist mode**, `available_on_public_internet: true` does not make a server appear in the hub. You also need to add it to `public_mcp_servers`: +In the **default strict-whitelist mode**, `available_on_public_internet: true` (the default) does not make a server appear in the hub. 
To advertise a server on the hub you also need to add it to `public_mcp_servers`: -```yaml title="Both flags set — visible to external callers AND on the hub" showLineNumbers +```yaml title="Server on the hub AND visible to external callers (the default)" showLineNumbers litellm_settings: public_mcp_servers: - deepwiki @@ -272,7 +272,7 @@ litellm_settings: mcp_servers: deepwiki: url: https://mcp.deepwiki.com/mcp - available_on_public_internet: true + # available_on_public_internet defaults to true ``` -If you set `public_mcp_hub_strict_whitelist: false`, the hub falls back to advertising every server that has `available_on_public_internet: true` — but the IP-based access filter on this page still applies independently. +If you set `litellm.public_mcp_hub_strict_whitelist: false`, the hub falls back to advertising every server that has `available_on_public_internet: true` — but the IP-based access filter on this page still applies independently to the actual tool endpoints. diff --git a/docs/providers/bedrock_agentcore.md b/docs/providers/bedrock_agentcore.md index 96a8265a6..17a362b13 100644 --- a/docs/providers/bedrock_agentcore.md +++ b/docs/providers/bedrock_agentcore.md @@ -328,7 +328,7 @@ When SigV4 mode is active, credentials are resolved in this priority order: 2. **`aws_role_name` alone** → `sts:AssumeRole`. The proxy's ambient credentials (instance profile, IRSA, env vars) are the source identity. Session name auto-generated if omitted. 3. **`aws_profile_name`** → resolved via the boto3 profile loader (`~/.aws/credentials`). 4. **`aws_access_key_id` + `aws_secret_access_key` + `aws_session_token`** → explicit temporary credentials. -5. **`aws_access_key_id` + `aws_secret_access_key`** → explicit long-lived credentials. +5. **`aws_access_key_id` + `aws_secret_access_key` + `aws_region_name`** → explicit long-lived credentials. All three must be set; without `aws_region_name` this branch is skipped. 6. 
**No credentials configured** → boto3 default chain (env vars, IRSA via `AWS_WEB_IDENTITY_TOKEN_FILE` + `AWS_ROLE_ARN`, instance metadata). Recognized fields on `litellm_params` for SigV4: @@ -384,7 +384,7 @@ Note that the SigV4 / Bearer auth handled by `litellm_params` is **separate** fr All standard A2A controls apply: - **Per-agent RBAC** — [Agent Permission Management](../a2a_agent_permissions). Returns HTTP 403 when the calling key/team isn't authorized for the AgentCore agent. -- **Access groups** — tag the agent with `agent_access_groups: ["clinical-tools"]` and grant the group to a team. +- **Access groups** — tag the agent with one or more access groups in the LiteLLM dashboard, then grant the group to a team or key via `object_permission.agent_access_groups`. See [Agent Access Groups](../a2a_agent_permissions#agent-access-groups). - **Trace ID enforcement** — set `require_trace_id_on_calls_to_agent: true` on `litellm_params` to require `x-litellm-trace-id` on every inbound call. See [A2A Overview — Trace ID enforcement](../a2a#trace-id-enforcement-optional-per-agent). ## Further Reading