diff --git a/docs/a2a.md b/docs/a2a.md index 9c86d0de..bd29f562 100644 --- a/docs/a2a.md +++ b/docs/a2a.md @@ -211,12 +211,46 @@ POST /a2a/{agent_name}/message/send ### Authentication -Include your LiteLLM Virtual Key in the `Authorization` header: +Include your LiteLLM Virtual Key in either of two headers — `x-litellm-api-key` is preferred when the inbound `Authorization` header may carry a token destined for the backend agent (e.g. when using the [convention-based passthrough](./a2a_agent_headers#method-3--convention-based-forwarding) to forward the caller's identity). ``` Authorization: Bearer sk-your-litellm-key +# or +x-litellm-api-key: Bearer sk-your-litellm-key ``` +#### Per-agent permission check + +After the virtual key is authenticated, LiteLLM checks whether the calling key (and its team) is allowed to invoke the requested agent. If not, the response is HTTP 403. See [Agent Permission Management](./a2a_agent_permissions) for the full intersection model and access groups. + +#### Trace ID enforcement (optional, per-agent) + +An agent can require every inbound request to carry a trace ID for cross-system audit threading. Set `require_trace_id_on_calls_to_agent: true` in the agent's `litellm_params`. When set, requests missing `x-litellm-trace-id` (or `x-litellm-session-id`) are rejected with HTTP 400. + +```bash title="Register an agent that requires inbound trace IDs" showLineNumbers +curl -X POST http://localhost:4000/v1/agents \ + -H "Authorization: Bearer sk-master-key" \ + -H "Content-Type: application/json" \ + -d '{ + "agent_name": "audit-critical-agent", + "agent_card_params": { ... }, + "litellm_params": { + "require_trace_id_on_calls_to_agent": true + } + }' +``` + +The reverse direction — enforcing trace ID on **outbound** calls made by a key owned by an agent — is controlled by `require_trace_id_on_calls_by_agent` on the same `litellm_params` block. 
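The inbound check can be sketched roughly as follows. This is an illustration under stated assumptions, not LiteLLM's implementation: the helper names `find_trace_id` and `rejection_status` are hypothetical, and only the two header names, the `require_trace_id_on_calls_to_agent` flag, and the HTTP 400 outcome come from the text above.

```python
from typing import Mapping, Optional

# Header names from the docs; the trace ID wins over the session ID fallback.
TRACE_HEADERS = ("x-litellm-trace-id", "x-litellm-session-id")


def find_trace_id(headers: Mapping[str, str]) -> Optional[str]:
    """Return the first non-empty trace/session header value, if any (hypothetical helper)."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    for name in TRACE_HEADERS:
        if lowered.get(name):
            return lowered[name]
    return None


def rejection_status(
    headers: Mapping[str, str], litellm_params: Mapping[str, object]
) -> Optional[int]:
    """Return 400 when the agent requires a trace ID and none was sent, else None."""
    required = bool(litellm_params.get("require_trace_id_on_calls_to_agent"))
    if required and find_trace_id(headers) is None:
        return 400
    return None
```

A client calling an agent registered with this flag would therefore send `x-litellm-trace-id` (or `x-litellm-session-id`) on every request to avoid the 400.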
+ +#### Sub-agent identity propagation + +When the backend agent itself calls LiteLLM (for chat completions or to invoke a sub-agent), LiteLLM forwards two headers to maintain trace continuity: + +- `X-LiteLLM-Trace-Id` — links all calls in the chain to a single trace +- `X-LiteLLM-Agent-Id` — attributes spend to the originating agent + +The caller's **virtual key** and **end-user ID** are not automatically forwarded. If the downstream agent needs the user's identity, propagate it explicitly via [`extra_headers` or the `x-a2a-{agent_name_or_id}-{header}` convention](./a2a_agent_headers). + ### Request Format LiteLLM follows the [A2A JSON-RPC 2.0 specification](https://github.com/google/A2A): diff --git a/docs/a2a_agent_permissions.md b/docs/a2a_agent_permissions.md index 93f367f4..ef721955 100644 --- a/docs/a2a_agent_permissions.md +++ b/docs/a2a_agent_permissions.md @@ -208,6 +208,39 @@ curl -X POST "http://localhost:4000/a2a/agent-456" \ -d '{"message": {"role": "user", "parts": [{"type": "text", "text": "Hello"}]}}' ``` +## Agent Access Groups + +Granting individual agents to every key or team gets unwieldy as the agent catalog grows. **Agent access groups** let you tag agents with logical labels in the dashboard, then grant the **group** to a key or team — adding a new agent to the group automatically makes it available to every key/team that holds the group. + +### 1. Tag the agent with one or more groups + +In the LiteLLM dashboard: + +1. Go to **Agents**. +2. Create or edit an agent. +3. Under **Access Groups**, type a group name (e.g. `clinical-tools`) and press Enter. + +:::note +Tagging an agent with access groups is currently a dashboard-only operation. The `POST /v1/agents` body schema does not expose `agent_access_groups` as a top-level field; the group tags persist via the underlying DB column and are consumed during permission resolution. +::: + +### 2. 
Grant a key or team the group + +```bash title="Key with access to two agent groups" showLineNumbers +curl -X POST "http://localhost:4000/key/generate" \ + -H "Authorization: Bearer sk-master-key" \ + -H "Content-Type: application/json" \ + -d '{ + "object_permission": { + "agent_access_groups": ["clinical-tools", "research-tools"] + } + }' +``` + +The key now has access to every agent tagged with either group — no per-agent enumeration required. The same `agent_access_groups` field is also valid on a team's `object_permission`. + +When a key has **both** a direct `agents` list and `agent_access_groups`, the union is computed (any agent reached by either path is allowed), and then the team-level intersection is applied as described below. + ## How It Works ```mermaid @@ -215,28 +248,32 @@ flowchart TD A[Request to invoke agent] --> B{LiteLLM Virtual Key has agent restrictions?} B -->|Yes| C{LiteLLM Team has agent restrictions?} B -->|No| D{LiteLLM Team has agent restrictions?} - + C -->|Yes| E[Use intersection of key + team permissions] C -->|No| F[Use key permissions only] - + D -->|Yes| G[Inherit team permissions] D -->|No| H[Allow ALL agents] - + E --> I{Agent in allowed list?} F --> I G --> I H --> J[Allow request] - + I -->|Yes| J I -->|No| K[Return 403 Forbidden] ``` +A2A permission resolution operates over two levels: Key and Team. (MCP's [permission hierarchy](./mcp_control#permission-hierarchy) extends to End-user / Agent / Org additionally — agent permissions are a narrower model today.) 
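As a rough sketch of the two-level resolution described above (not the actual LiteLLM code), the union-then-intersect behaviour might look like this. The function names and the `group_members` lookup table are hypothetical, and treating a missing or empty list as "no restriction" is an assumption of the sketch.

```python
from typing import Dict, Iterable, Optional, Set


def expand_grants(
    agents: Optional[Iterable[str]],
    access_groups: Optional[Iterable[str]],
    group_members: Dict[str, Set[str]],
) -> Optional[Set[str]]:
    """Union of direct agent grants and agents reached through access groups.

    Returns None when the entity sets no restriction at all (assumption:
    an empty list counts as unset).
    """
    if not agents and not access_groups:
        return None
    allowed = set(agents or [])
    for group in access_groups or []:
        allowed |= group_members.get(group, set())
    return allowed


def resolve_allowed_agents(
    key: Optional[Set[str]], team: Optional[Set[str]]
) -> Optional[Set[str]]:
    """Combine key and team grants; None means every agent is allowed."""
    if key is not None and team is not None:
        return key & team  # both restricted: most restrictive wins
    return key if key is not None else team  # otherwise inherit whichever exists
```

With a hypothetical `clinical` group containing `agent-1` and `agent-7`, a key granted that group under a team restricted to `agent-1` resolves to `agent-1` only.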
+ | Key Permissions | Team Permissions | Result | Notes | |-----------------|------------------|--------|-------| | None | None | Key can access **all** agents | Open access by default when no restrictions are set | | `["agent-1", "agent-2"]` | None | Key can access `agent-1` and `agent-2` | Key uses its own permissions | | None | `["agent-1", "agent-3"]` | Key can access `agent-1` and `agent-3` | Key inherits team's permissions | | `["agent-1", "agent-2"]` | `["agent-1", "agent-3"]` | Key can access `agent-1` only | Intersection of both lists (most restrictive wins) | +| `agent_access_groups: ["clinical"]` | None | Key can access every agent tagged `clinical` | Access groups resolved to concrete agent IDs | +| `agent_access_groups: ["clinical"]` | `agents: ["agent-1"]` | Intersection of (every agent tagged `clinical`) and `["agent-1"]` | Mixing direct and group grants is supported | ## Viewing Permissions diff --git a/docs/auth_overview.md b/docs/auth_overview.md new file mode 100644 index 00000000..de13c244 --- /dev/null +++ b/docs/auth_overview.md @@ -0,0 +1,161 @@ +import Tabs from '@theme/Tabs'; +import TabItem from '@theme/TabItem'; + +# AuthN/AuthZ Reference — MCP and A2A Side-by-Side + +LiteLLM exposes two gateway surfaces that share most authentication and authorization primitives but diverge in a few important places. This page is the side-by-side reference: which header does what, where the two surfaces are symmetric, and where they're not. Each section links out to the dedicated page for the deep dive. + +| Surface | Endpoints | Dedicated docs | +|---|---|---| +| **MCP Gateway** | `/mcp`, `/{server}/mcp`, `/toolset/{name}/mcp`, `/sse`, `/v1/mcp/...`, `/mcp-rest/...` | [MCP Overview](./mcp) | +| **A2A Agent Gateway** | `/a2a/{agent_id}`, `/a2a/{agent_id}/message/send`, `/v1/agents/...` | [A2A Overview](./a2a) | + +--- + +## 1. 
Client → LiteLLM (authenticating the caller) + +Both surfaces accept the same LiteLLM Virtual Key headers and the same identification headers. The one place they diverge: the MCP **ASGI** routes (the streamable MCP endpoints at `/mcp`, `/{name}/mcp`, `/toolset/{name}/mcp`, `/sse`) bypass the standard FastAPI auth dependency and only check `x-litellm-api-key` and `Authorization`. The MCP **REST/management** routes (`/v1/mcp/...`, `/mcp-rest/...`) and **all** A2A routes accept the full six-header set. + +| Header | Purpose | MCP ASGI | MCP REST + A2A | +|---|---|---|---| +| `x-litellm-api-key: Bearer sk-...` | Preferred LiteLLM Virtual Key header. Use whenever the inbound `Authorization` header may carry a different token (OAuth passthrough, OBO, A2A per-user forwarding). | ✓ | ✓ | +| `Authorization: Bearer sk-...` | Standard fallback. Stripped of the `Bearer ` prefix before lookup. | ✓ | ✓ | +| `API-Key`, `x-api-key`, `x-goog-api-key`, `Ocp-Apim-Subscription-Key` | Vendor-specific aliases (Azure, Anthropic, Google AI Studio, Azure APIM). | — | ✓ | +| `x-litellm-end-user-id` | End-user identification. Layers per-end-user budgets, MCP access intersection, and audit log entries on top of the key. `x-litellm-customer-id` is an accepted alias. | ✓ | ✓ | +| `x-litellm-trace-id` | Cross-request correlation ID. Falls back to `x-litellm-session-id` or any matching `x--session-id` header. | ✓ | ✓ | +| `x-litellm-session-id` | Session grouping. Same parse path as trace-id, lower priority. | ✓ | ✓ | +| `x-litellm-tags` | Comma-separated tags for spend-log labeling and tag-based routing. Body field `tags` takes precedence. | — (not parsed on MCP ASGI) | ✓ | +| `x-litellm-mcp-debug: true` | Returns masked diagnostic response headers (`x-mcp-debug-*`). See [MCP OAuth — Debugging](./mcp_oauth#debugging-oauth). | ✓ | — | +| `x-mcp-servers` | Scope a request to specific MCP servers (comma-separated). | ✓ | — | + +--- + +## 2. 
LiteLLM → Backend (authenticating the gateway to the agent or MCP server) + +This is the section where MCP and A2A diverge most. MCP has a first-class `auth_type` field on each server registration. **A2A has no `auth_type` field at all** — the outbound auth mode is inferred from what's present in `litellm_params`. + +### MCP — `auth_type` enum + +Nine values. The MCP server's outbound `Authorization` header (or per-request SigV4 signature) is determined by `auth_type`. See [MCP Overview — Add HTTP MCP Server](./mcp#add-http-mcp-server) for the full table. + +| `auth_type` | Mechanism | Dedicated docs | +|---|---|---| +| `none` | No auth header added | — | +| `api_key` / `bearer_token` / `basic` / `authorization` / `token` | Static header, sent verbatim per call | [MCP Overview](./mcp) | +| `oauth2` | PKCE (interactive) or M2M `client_credentials`. Discriminated by `oauth2_flow`. | [MCP OAuth](./mcp_oauth) | +| `oauth2_token_exchange` | RFC 8693 On-Behalf-Of (OBO) — exchange the caller's bearer token for a scoped MCP token | [MCP OBO Auth](./mcp_obo_auth) | +| `aws_sigv4` | Per-request SigV4 signature using a dedicated MCP-side credential chain | [MCP AWS SigV4](./mcp_aws_sigv4) | + +### A2A — auth mode inferred from `litellm_params` + +There is no `auth_type` field on an agent. The provider handler picks the auth mechanism from the contents of `litellm_params`: + +| Mode | When it fires | Send to backend | +|---|---|---| +| **Bearer / JWT** | `litellm_params.api_key` is set | `Authorization: Bearer ` | +| **SigV4** (AgentCore only) | `litellm_params.api_key` is unset and the provider is `bedrock` | Per-request SigV4 via the full `base_aws_llm` credential chain (six entry points: web-identity+role, role alone, profile, session-token triple, key+secret, env-vars / IRSA fallback). See [Bedrock AgentCore — A2A Gateway Authentication](./providers/bedrock_agentcore#a2a-gateway-authentication). 
| +| **Provider-native** | `litellm_params.custom_llm_provider` matches a non-Bedrock provider (Vertex AI Agent Engine, LangGraph, Azure AI Foundry, Pydantic AI) | The provider's normal auth path | + +The dual JWT-vs-SigV4 mode is specific to AgentCore. Other A2A providers (Vertex, LangGraph, Azure Foundry) use the provider's own credential conventions — see the relevant provider page under [Providers](./providers). + +### Zero-trust add-on (MCP-only today) + +If the MCP server needs to **cryptographically verify** the request came through LiteLLM, layer the [MCP JWT Signer](./mcp_zero_trust) guardrail on top. It signs every outbound tool call with a short-lived RS256 JWT and publishes a JWKS endpoint the MCP server can verify against. This is a guardrail (`guardrail: mcp_jwt_signer`, `mode: pre_mcp_call`), not an `auth_type` — it composes with any `auth_type`. + +--- + +## 3. Per-user header passthrough + +Both surfaces let clients forward credentials destined for a specific backend server/agent without admin pre-configuration. The conventions look symmetric but parse differently — be precise when copy-pasting. + +| Surface | Prefix | Parse rule | Match against | Example | +|---|---|---|---|---| +| **MCP** | `x-mcp-` | Split on the **first dash** after the prefix → `(server_alias, header_name)` | Server's `alias`, then `server_name` (case-insensitive) | `x-mcp-github-authorization: Bearer ghp_...` → server `github`, header `Authorization` | +| **A2A** | `x-a2a-` | Exact-prefix match against `x-a2a-{agent_id_lower}-` or `x-a2a-{agent_name_lower}-`; everything after the trailing dash is the header name | Agent's UUID **and** human-readable name (both tried) | `x-a2a-my-agent-x-api-key: secret` → agent `my-agent`, header `x-api-key` | + +Both surfaces also support admin-controlled alternatives that compose with the user passthrough: + +| Mechanism | MCP | A2A | Notes | +|---|---|---|---| +| `static_headers: {K: V}` | ✓ | ✓ | Always sent. 
**Wins over user passthrough** on key conflicts. | +| `extra_headers: [name, name, ...]` | ✓ | ✓ | Admin-allowlist of client header names to forward verbatim. | +| `x---
` convention | ✓ (`x-mcp-`) | ✓ (`x-a2a-`) | Client-driven, no admin config needed. | + +See [MCP Overview — Forwarding Custom Headers](./mcp#forwarding-custom-headers-to-mcp-servers) and [A2A Agent Authentication Headers](./a2a_agent_headers) for the full mechanics. + +--- + +## 4. AuthZ — RBAC and access groups + +Both surfaces use the `object_permission` model with intersection-style resolution, but at different depths today. MCP resolves across five levels; A2A across two. The detailed flowcharts and tables live on the dedicated pages: + +- [MCP Permission Hierarchy](./mcp_control#permission-hierarchy) +- [A2A Agent Permission Management — How It Works](./a2a_agent_permissions#how-it-works) + +| Level | MCP field | A2A field | +|---|---|---| +| **Key** | `object_permission.mcp_servers`, `object_permission.mcp_access_groups`, `object_permission.mcp_tool_permissions` | `object_permission.agents`, `object_permission.agent_access_groups` | +| **Team** | Same | Same (inheritance-first: if the key has no list, it inherits the team's) | +| **End user** | Same (via `x-litellm-end-user-id`) | — not resolved today | +| **Agent** | Same (via `x-litellm-agent-id`) | — not applicable (the agent is the target) | +| **Org** | Same — acts as a **ceiling** | — not resolved today | + +| Concern | MCP | A2A | +|---|---|---| +| Per-server / per-agent allowlist | `object_permission.mcp_servers` | `object_permission.agents` | +| Access groups (tag-based grants) | `object_permission.mcp_access_groups` | `object_permission.agent_access_groups` | +| Per-server tool-level allowlist | `object_permission.mcp_tool_permissions: {server_id: [tool, ...]}` | n/a (tools live inside the agent) | +| Server-registration allowlist (admin-static) | `allowed_tools` / `disallowed_tools` on the MCP server | n/a | +| Param-level allowlist | `allowed_params: {tool_name: [param, ...]}` on the MCP server | n/a | +| Reject behaviour | `list_tools` filters out hidden servers; `call_tool` returns error | `GET 
/v1/agents` filters; `POST /a2a/{agent_id}` returns HTTP **403** | + +--- + +## 5. Trace IDs and identity propagation + +`x-litellm-trace-id` is **accepted** on every request and threaded through logging on both surfaces. A few A2A-specific extras: + +| Setting | Scope | Behaviour | +|---|---|---| +| `require_trace_id_on_calls_to_agent: true` | Per-agent, on the agent's `litellm_params` | Reject inbound `/a2a/{agent_id}` calls missing `x-litellm-trace-id` (or `x-litellm-session-id` fallback) with **HTTP 400**. See [A2A Overview — Trace ID enforcement](./a2a#trace-id-enforcement-optional-per-agent). | +| `require_trace_id_on_calls_by_agent: true` | Per-agent, on the agent's `litellm_params` | Reverse direction — when a key **owned by** that agent makes outbound calls, require a trace ID on those. | + +**Sub-agent identity propagation** — when LiteLLM dispatches a downstream call as part of an A2A invocation, it forwards `X-LiteLLM-Trace-Id` and `X-LiteLLM-Agent-Id` to maintain trace continuity and spend attribution. The original virtual key and end-user identity are **not** auto-forwarded. Use `extra_headers` or the `x-a2a-{agent_name_or_id}-{header}` convention to thread identity explicitly. See [A2A Overview — Sub-agent identity propagation](./a2a#sub-agent-identity-propagation). + +--- + +## 6. Guardrails on the gateway path + +| Concern | MCP | A2A | +|---|---|---| +| Pre-call input guardrails (Presidio, Bedrock, Lakera, Aporia, etc.) | `mode: pre_mcp_call` | Standard chat-completion guardrails apply to the underlying LLM calls the agent makes | +| During-call intervention | `mode: during_mcp_call` | — | +| Zero-trust JWT signing | [`mcp_jwt_signer` guardrail](./mcp_zero_trust) | — (not applicable to A2A today) | +| Documentation | [MCP Guardrails](./mcp_guardrail), [MCP Zero Trust](./mcp_zero_trust) | Standard [guardrails docs](./proxy/guardrails) apply via the agent's underlying model calls | + +--- + +## 7. 
Cheatsheet — what header does what + +For copy-paste, the high-frequency request headers across both surfaces: + +```http +# Always (LiteLLM-side auth and identification) +x-litellm-api-key: Bearer sk-... +# or +Authorization: Bearer sk-... + +x-litellm-end-user-id: user-42 +x-litellm-trace-id: 8f4a-2b1c-d3e5-... + +# MCP — server scoping / per-user passthrough +x-mcp-servers: github,zapier +x-mcp-github-authorization: Bearer ghp_ # user passthrough to github_mcp +x-litellm-mcp-debug: true # diagnostic response headers + +# A2A — per-user passthrough +x-a2a-my-agent-authorization: Bearer # caller's token to my-agent +x-a2a-my-agent-x-api-key: # additional per-agent header +``` + +For the deep dives, follow the cross-links above into the dedicated pages. diff --git a/docs/mcp.md b/docs/mcp.md index f6fe01ac..af9b0f50 100644 --- a/docs/mcp.md +++ b/docs/mcp.md @@ -226,13 +226,19 @@ mcp_servers: - **Description**: Optional description for the server - **Auth Type**: Optional authentication type. Supported values: - | Value | Header sent | + | Value | Header sent (managed SSE/HTTP transport) | |-------|-------------| + | `none` | No auth header added | | `api_key` | `X-API-Key: ` | | `bearer_token` | `Authorization: Bearer ` | | `basic` | `Authorization: Basic ` | - | `authorization` | `Authorization: ` | - | `aws_sigv4` | Per-request AWS SigV4 signature ([details](./mcp_aws_sigv4.md)) | + | `authorization` | `Authorization: ` (verbatim, no prefix) | + | `token` | `Authorization: token ` (GitHub-style) | + | `oauth2` | `Authorization: Bearer ` — PKCE or M2M `client_credentials`. See [MCP OAuth](./mcp_oauth.md) | + | `oauth2_token_exchange` | `Authorization: Bearer ` — RFC 8693 On-Behalf-Of. See [MCP OBO Auth](./mcp_obo_auth.md) | + | `aws_sigv4` | Per-request AWS SigV4 signature. See [MCP AWS SigV4](./mcp_aws_sigv4.md) | + + Note: the header table above describes the managed SSE/HTTP transport path. 
The OpenAPI-tool path emits `Authorization: ApiKey ` instead of `X-API-Key` for `auth_type: api_key`; the deprecated `x-mcp-auth` broadcast header also uses the `ApiKey` form. - **Extra Headers**: Optional list of additional header names that should be forwarded from client to the MCP server - **Static Headers**: Optional map of header key/value pairs to include every request to the MCP server. diff --git a/docs/mcp_control.md b/docs/mcp_control.md index ccaa37f9..07c60a58 100644 --- a/docs/mcp_control.md +++ b/docs/mcp_control.md @@ -35,6 +35,50 @@ When Creating a Key, Team, or Organization, you can select the allowed MCP Serve style={{width: '80%', display: 'block', margin: '0'}} /> +## Permission Hierarchy + +Permissions can be set at five distinct levels. When more than one level applies to a request, LiteLLM **intersects** the lists (most-restrictive wins) — except for the organization level, which acts as a **ceiling**. + +| Level | Source | How it composes | +|---|---|---| +| **Key** | `object_permission.mcp_servers` / `object_permission.mcp_access_groups` on the virtual key | If the key has an explicit list, it's used. | +| **Team** | Same fields on the team | If both key and team have lists, the result is the **intersection** (only servers in both). If only the team has a list, the key inherits it. | +| **End user** | Same fields on the `LiteLLM_EndUserTable` row matching `x-litellm-end-user-id` | Intersected with the running result. Skipped if no end-user-id is present on the request. | +| **Agent** | Same fields on the agent identified by `x-litellm-agent-id` | Intersected with the running result. Skipped if no agent-id is present. | +| **Organization** | Same fields on the org owning the key/team | Acts as a **ceiling** — the final allowed-server set is intersected with the org's list. If the org has no list, no additional restriction. | + +If no level has a list, the request can access **every** MCP server (open by default). 
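The table above can be read as a fold over the five levels. The following is a minimal sketch of that reading, assuming each level is either `None` (no list set) or a set of server names; `resolve_mcp_servers` is a hypothetical name, not a LiteLLM function.

```python
from typing import Optional, Set

ServerSet = Optional[Set[str]]  # None = this level sets no list


def resolve_mcp_servers(
    key: ServerSet,
    team: ServerSet,
    end_user: ServerSet = None,
    agent: ServerSet = None,
    org: ServerSet = None,
) -> ServerSet:
    """Fold the levels in documented order; None as a result means all servers."""
    running = key
    for level in (team, end_user, agent, org):
        if level is None:
            continue  # a level without a list adds no restriction
        # First list seen is inherited as-is; later lists narrow the running set.
        running = set(level) if running is None else running & level
    return running
```

Because the organization is folded in last, its list caps whatever the earlier levels allowed, which is the "ceiling" behaviour described in the table.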
+ +```mermaid +flowchart TD + A[Inbound MCP request] --> B{Key has mcp_servers list?} + B -->|Yes| C[Start with key's list] + B -->|No| D[Start with: all servers] + C --> E{Team has list?} + D --> E + E -->|Yes, key also had list| F[Intersect with team's list] + E -->|Yes, key had no list| G[Use team's list] + E -->|No| H[Keep current] + F --> I + G --> I + H --> I + I[Running set] --> J{end-user-id present and end-user has list?} + J -->|Yes| K[Intersect with end-user list] + J -->|No| L[Keep current] + K --> M + L --> M + M{agent-id present and agent has list?} + M -->|Yes| N[Intersect with agent list] + M -->|No| O[Keep current] + N --> P + O --> P + P{Org has list?} + P -->|Yes| Q[Cap final set to org's list] + P -->|No| R[Final set] + Q --> R +``` + +The same intersection model applies to the per-server tool-level dict `mcp_tool_permissions` (see [Per-entity Tool-Level Permissions](#per-entity-tool-level-permissions) below). ## Allow/Disallow MCP Tools @@ -625,16 +669,93 @@ When creating API keys, you can assign them to specific access groups for permis /> - -## Set Allowed Tools for a Key, Team, or Organization +## Per-entity Tool-Level Permissions {#per-entity-tool-level-permissions} Control which tools different teams can access from the same MCP server. For example, give your Engineering team access to `list_repositories`, `create_issue`, and `search_code`, while Sales only gets `search_code` and `close_issue`. - This video shows how to set allowed tools for a Key, Team, or Organization. +### `mcp_tool_permissions` API + +`object_permission.mcp_tool_permissions` is a `Dict[server_id, List[tool_name]]` on the key, team, end-user, agent, or organization. It's evaluated **after** server-level access has been resolved (see [Permission Hierarchy](#permission-hierarchy) above) and applies the same five-level intersection — most-restrictive wins, organization acts as a ceiling. 
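To make the tool-level intersection concrete, here is a small sketch for a single server. It assumes that a level with no entry for the server imposes no tool restriction; that behaviour, and the helper name `allowed_tools_for_server`, are assumptions of this example rather than documented LiteLLM behaviour.

```python
from typing import Dict, List, Optional

ToolPermissions = Dict[str, List[str]]  # server_id -> allowed tool names


def allowed_tools_for_server(
    server_id: str, levels: List[Optional[ToolPermissions]]
) -> Optional[List[str]]:
    """Intersect per-level tool lists for one server; None = no tool restriction."""
    result: Optional[List[str]] = None
    for perms in levels:
        if not perms or server_id not in perms:
            continue  # assumption: a level without an entry does not restrict
        tools = perms[server_id]
        # Keep the order of the first list seen while narrowing it.
        result = list(tools) if result is None else [t for t in result if t in tools]
    return result
```

For instance, a key limited to `search_code` and `close_issue` under a team limited to `list_repositories`, `create_issue`, and `search_code` ends up with `search_code` only.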
+ +This is distinct from the server-registration-level `allowed_tools` / `disallowed_tools` (which apply to **every** caller of the server). `mcp_tool_permissions` lets you carve out per-team subsets without changing the server config. + + + + +```bash title="Engineering key — full GitHub access" showLineNumbers +curl -X POST "http://localhost:4000/key/generate" \ + -H "Authorization: Bearer sk-master-key" \ + -H "Content-Type: application/json" \ + -d '{ + "object_permission": { + "mcp_servers": ["github_mcp"], + "mcp_tool_permissions": { + "github_mcp": ["list_repositories", "create_issue", "search_code"] + } + } + }' +``` + +```bash title="Sales key — narrower tool subset on the same server" showLineNumbers +curl -X POST "http://localhost:4000/key/generate" \ + -H "Authorization: Bearer sk-master-key" \ + -H "Content-Type: application/json" \ + -d '{ + "object_permission": { + "mcp_servers": ["github_mcp"], + "mcp_tool_permissions": { + "github_mcp": ["search_code", "close_issue"] + } + } + }' +``` + + + + +```bash title="Team-wide tool subset (all keys inherit)" showLineNumbers +curl -X POST "http://localhost:4000/team/new" \ + -H "Authorization: Bearer sk-master-key" \ + -H "Content-Type: application/json" \ + -d '{ + "team_alias": "engineering", + "object_permission": { + "mcp_servers": ["github_mcp", "deepwiki_mcp"], + "mcp_tool_permissions": { + "github_mcp": ["list_repositories", "create_issue", "search_code"] + } + } + }' +``` + +When the key also sets `mcp_tool_permissions` for `github_mcp`, the resulting tool list is the **intersection** of the two. + + + + +When an agent (identified by `x-litellm-agent-id`) calls MCP tools, the agent's own `mcp_tool_permissions` participate in the intersection. Useful for capping what an autonomous agent can do regardless of which key originally invoked it. 
+ +```bash showLineNumbers +curl -X PATCH "http://localhost:4000/v1/agents/{agent_id}" \ + -H "Authorization: Bearer sk-master-key" \ + -H "Content-Type: application/json" \ + -d '{ + "object_permission": { + "mcp_servers": ["github_mcp"], + "mcp_tool_permissions": { + "github_mcp": ["search_code"] + } + } + }' +``` + + + + ## Dashboard View Modes diff --git a/docs/mcp_oauth.md b/docs/mcp_oauth.md index 7980a192..cb809cdf 100644 --- a/docs/mcp_oauth.md +++ b/docs/mcp_oauth.md @@ -296,11 +296,16 @@ curl http://localhost:4000/mcp-rest/tools/call \ | Field | Required | Description | |-------|----------|-------------| -| `auth_type` | Yes | Must be `oauth2` | -| `client_id` | Yes | OAuth2 client ID. Supports `os.environ/VAR_NAME` | -| `client_secret` | Yes | OAuth2 client secret. Supports `os.environ/VAR_NAME` | -| `token_url` | Yes | Token endpoint URL | -| `scopes` | No | List of scopes to request | +| `auth_type` | Yes | Must be `oauth2`. For RFC 8693 On-Behalf-Of, use `oauth2_token_exchange` instead — see [MCP OBO Auth](./mcp_obo_auth.md). | +| `oauth2_flow` | No | Explicit flow selector. One of `"client_credentials"` (M2M) or `"authorization_code"` (interactive PKCE). If omitted, LiteLLM infers from the other fields: `authorization_url` present → interactive; only `token_url` + `client_id` + `client_secret` → client credentials. Set explicitly when in doubt — for example, when a legacy DB row has both `authorization_url` and `token_url` but you want M2M. | +| `client_id` | Yes for M2M, optional for interactive | OAuth2 client ID. Required for `client_credentials`. For interactive flows, can be obtained via Dynamic Client Registration (RFC 7591) at `POST /{server_name}/register` if the upstream supports it. Supports `os.environ/VAR_NAME`. | +| `client_secret` | Yes for M2M, optional for interactive | OAuth2 client secret. Same applicability as `client_id`. Supports `os.environ/VAR_NAME`. | +| `token_url` | Yes for M2M, optional for interactive | Token endpoint URL. 
LiteLLM POSTs to this for `client_credentials` and for the authorization-code exchange. | +| `authorization_url` | Interactive only | Upstream authorization endpoint. When present, LiteLLM treats the server as interactive PKCE and proxies `GET /{server_name}/authorize` to this URL. | +| `registration_url` | Optional | Upstream Dynamic Client Registration endpoint (RFC 7591). When present, `POST /{server_name}/register` proxies through to this URL. | +| `scopes` | No | List of scopes to request. For M2M, joined into the `scope` parameter on the token request. For interactive, forwarded on the authorize request. | +| `token_validation` | No | Dict of key-value rules checked against the OAuth token response after the `/token` exchange. Any rule mismatch fails the exchange with `token_validation_failed`. Useful for asserting a tenant claim like `{"team.enterprise_id": "T12345"}`. | +| `token_storage_ttl_seconds` | No | Override the TTL for the per-user token cache (interactive flow). If unset, LiteLLM uses `expires_in - buffer` from the token response. | ## Debugging OAuth diff --git a/docs/mcp_public_internet.md b/docs/mcp_public_internet.md index 69dd7464..8e04004a 100644 --- a/docs/mcp_public_internet.md +++ b/docs/mcp_public_internet.md @@ -249,3 +249,30 @@ general_settings: ``` When empty, the standard private ranges are used (`10.0.0.0/8`, `172.16.0.0/12`, `192.168.0.0/16`, `127.0.0.0/8`). + +--- + +## Public Internet vs MCP Hub Visibility + +`available_on_public_internet` and the **MCP Hub** (`GET /public/mcp_hub`) are two separate mechanisms that are easy to confuse: + +| Concern | Controlled by | Default | +|---|---|---| +| Can an external (non-private-CIDR) caller see this server at the MCP tool endpoints (list/call)? | `available_on_public_internet` on the server | `True` (visible by default; toggle to `false` to restrict to private CIDRs) | +| Does this server appear in the unauthenticated `GET /public/mcp_hub` advertisement? 
| `litellm.public_mcp_servers` list, gated by `litellm.public_mcp_hub_strict_whitelist` | Hub strict whitelist is **on** by default — only servers explicitly listed in `public_mcp_servers` are advertised | + +In the **default strict-whitelist mode**, `available_on_public_internet: true` (the default) does not make a server appear in the hub. To advertise a server on the hub you also need to add it to `public_mcp_servers`: + +```yaml title="Server on the hub AND visible to external callers (the default)" showLineNumbers +litellm_settings: + public_mcp_servers: + - deepwiki + # public_mcp_hub_strict_whitelist defaults to true + +mcp_servers: + deepwiki: + url: https://mcp.deepwiki.com/mcp + # available_on_public_internet defaults to true +``` + +If you set `litellm.public_mcp_hub_strict_whitelist: false`, the hub falls back to advertising every server that has `available_on_public_internet: true` — but the IP-based access filter on this page still applies independently to the actual tool endpoints. diff --git a/docs/providers/bedrock_agentcore.md b/docs/providers/bedrock_agentcore.md index 7802624f..17a362b1 100644 --- a/docs/providers/bedrock_agentcore.md +++ b/docs/providers/bedrock_agentcore.md @@ -245,8 +245,154 @@ model_list: | `qualifier` | string | Optional runtime qualifier/version to invoke a specific version of the agent runtime | | `runtimeSessionId` | string | Optional custom session ID (must be 33+ characters). If not provided, LiteLLM generates one automatically | +## LiteLLM A2A Gateway {#litellm-a2a-gateway} + +Register a Bedrock AgentCore runtime as a first-class A2A agent on the LiteLLM [Agent Gateway](../a2a). This gives you per-agent RBAC, access groups, trace-ID enforcement, and the `x-a2a-{agent_name_or_id}-{header}` per-user passthrough convention — same surface as any other A2A provider. + +This path is distinct from the chat-completions invocation above. Pick one based on your client: + +| You want to call AgentCore via... 
| Use this path | +|---|---| +| `/v1/chat/completions` with `model: bedrock/agentcore/` | Chat completions (covered above) | +| `POST /a2a/{agent_id}` with A2A JSON-RPC 2.0 (`message/send` or `message/stream`) | A2A Gateway (this section) | + +### 1. Register the agent + + + + +1. Go to **Agents** → **Add Agent**. +2. Select **Bedrock AgentCore** as the provider. +3. Paste the AgentCore Runtime ARN as the agent URL. +4. Configure AWS credentials (or leave blank to use the proxy's ambient credential chain — see [Authentication](#a2a-gateway-authentication) below). + + + + +```bash showLineNumbers +curl -X POST http://localhost:4000/v1/agents \ + -H "Authorization: Bearer sk-admin" \ + -H "Content-Type: application/json" \ + -d '{ + "agent_name": "my-agentcore-runtime", + "agent_card_params": { + "name": "my-agentcore-runtime", + "description": "Internal research agent", + "url": "bedrock/agentcore/arn:aws:bedrock-agentcore:us-east-1:123456789012:runtime/my-runtime" + }, + "litellm_params": { + "custom_llm_provider": "bedrock", + "aws_role_name": "arn:aws:iam::123456789012:role/LiteLLMAgentCoreInvoker", + "aws_region_name": "us-east-1" + } + }' +``` + + + + +### 2. 
Invoke via A2A + +```bash showLineNumbers +curl -X POST http://localhost:4000/a2a/my-agentcore-runtime/message/send \ + -H "x-litellm-api-key: Bearer sk-client-key" \ + -H "Content-Type: application/json" \ + -d '{ + "jsonrpc": "2.0", + "id": "1", + "method": "message/send", + "params": { + "message": { + "role": "user", + "parts": [{"kind": "text", "text": "Summarize the latest clinical trial results"}], + "messageId": "msg-1" + } + } + }' +``` + +### Authentication {#a2a-gateway-authentication} + +The AgentCore A2A path supports **two distinct outbound auth modes**, picked automatically based on what's in `litellm_params`: + +| Mode | When it fires | What's sent to AgentCore | +|---|---|---| +| **Bearer / JWT** | `litellm_params.api_key` is set (any value) | `Authorization: Bearer ` — SigV4 is bypassed entirely | +| **SigV4** | `litellm_params.api_key` is **not** set | Per-request SigV4 signature using the full AWS credential chain (below) | + +#### SigV4 credential resolution + +When SigV4 mode is active, credentials are resolved in this priority order: + +1. **`aws_web_identity_token` + `aws_role_name` + `aws_session_name`** → `sts:AssumeRoleWithWebIdentity`. Cross-account IRSA path. +2. **`aws_role_name` alone** → `sts:AssumeRole`. The proxy's ambient credentials (instance profile, IRSA, env vars) are the source identity. Session name auto-generated if omitted. +3. **`aws_profile_name`** → resolved via the boto3 profile loader (`~/.aws/credentials`). +4. **`aws_access_key_id` + `aws_secret_access_key` + `aws_session_token`** → explicit temporary credentials. +5. **`aws_access_key_id` + `aws_secret_access_key` + `aws_region_name`** → explicit long-lived credentials. All three must be set; without `aws_region_name` this branch is skipped. +6. **No credentials configured** → boto3 default chain (env vars, IRSA via `AWS_WEB_IDENTITY_TOKEN_FILE` + `AWS_ROLE_ARN`, instance metadata). 
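The priority order above can be expressed as a short decision function. This is only a sketch of the documented ordering, not the `base_aws_llm` code itself: `pick_sigv4_branch` and the branch labels it returns are hypothetical names chosen for illustration.

```python
from typing import Mapping


def pick_sigv4_branch(params: Mapping[str, str]) -> str:
    """Name the credential branch the documented priority order would select."""

    def has(*fields: str) -> bool:
        # A field counts as set only when it is present and non-empty.
        return all(params.get(field) for field in fields)

    if has("aws_web_identity_token", "aws_role_name", "aws_session_name"):
        return "assume_role_with_web_identity"
    if has("aws_role_name"):
        return "assume_role"
    if has("aws_profile_name"):
        return "profile"
    if has("aws_access_key_id", "aws_secret_access_key", "aws_session_token"):
        return "temporary_credentials"
    if has("aws_access_key_id", "aws_secret_access_key", "aws_region_name"):
        return "long_lived_credentials"
    return "default_chain"
```

Note how a key and secret without `aws_region_name` fall through to the default chain, matching the caveat on step 5.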
+
+Recognized fields on `litellm_params` for SigV4:
+
+| Field | Description |
+|---|---|
+| `aws_role_name` | IAM role ARN to assume via STS |
+| `aws_session_name` | Session name for the AssumeRole call (auto-generated if omitted) |
+| `aws_external_id` | ExternalId passed to `sts:AssumeRole` for cross-account trust policies |
+| `aws_web_identity_token` | OIDC token for `AssumeRoleWithWebIdentity` (set explicitly or via `AWS_WEB_IDENTITY_TOKEN_FILE` env) |
+| `aws_profile_name` | AWS CLI profile name |
+| `aws_sts_endpoint` | Custom STS endpoint (VPC endpoints, FIPS endpoints) |
+| `aws_access_key_id` / `aws_secret_access_key` / `aws_session_token` | Explicit credentials |
+| `aws_region_name` | AWS region. If omitted, detected from the runtime ARN in `agent_card_params.url`. |
+
+#### IRSA on EKS
+
+For Kubernetes deployments using [IAM Roles for Service Accounts](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html), no explicit credential configuration is needed — boto3's default chain picks up `AWS_WEB_IDENTITY_TOKEN_FILE` and `AWS_ROLE_ARN` from the pod environment automatically.
+
+If you want the invocation to assume a **second** role (e.g. 
to separate the pod's identity from the agent-invocation identity for CloudTrail attribution), combine IRSA with `aws_role_name`:
+
+```bash showLineNumbers
+curl -X POST http://localhost:4000/v1/agents \
+  -H "Authorization: Bearer sk-admin" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "agent_name": "production-runtime",
+    "agent_card_params": {
+      "name": "production-runtime",
+      "url": "bedrock/agentcore/arn:aws:bedrock-agentcore:us-east-1:123456789012:runtime/prod"
+    },
+    "litellm_params": {
+      "custom_llm_provider": "bedrock",
+      "aws_role_name": "arn:aws:iam::123456789012:role/AgentCoreInvocationRole",
+      "aws_session_name": "litellm-prod"
+    }
+  }'
+```
+
+The proxy pod's IRSA role serves as the source identity for the AssumeRole call; the assumed role's CloudTrail entries reflect the agent invocation.
+
+### Per-user header passthrough
+
+The standard A2A header forwarding mechanisms apply — see [A2A Agent Authentication Headers](../a2a_agent_headers) for the full reference. All three methods work with AgentCore:
+
+- **`static_headers`** — always sent to AgentCore (e.g. a custom `X-Tenant-Id`)
+- **`extra_headers`** — admin-configured allowlist of client headers to forward
+- **`x-a2a-{agent_name_or_id}-{header}` convention** — caller-driven forwarding without admin config
+
+Note that the SigV4 / Bearer auth handled by `litellm_params` is **separate** from the agent-level header forwarding above. Auth headers are computed per-request by the AWS signer; user passthrough headers are merged into the request after signing.
+
+### RBAC and trace IDs
+
+All standard A2A controls apply:
+- **Per-agent RBAC** — [Agent Permission Management](../a2a_agent_permissions). Returns HTTP 403 when the calling key/team isn't authorized for the AgentCore agent.
+- **Access groups** — tag the agent with one or more access groups in the LiteLLM dashboard, then grant the group to a team or key via `object_permission.agent_access_groups`. 
See [Agent Access Groups](../a2a_agent_permissions#agent-access-groups).
+- **Trace ID enforcement** — set `require_trace_id_on_calls_to_agent: true` on `litellm_params` to require `x-litellm-trace-id` (or `x-litellm-session-id`) on every inbound call; requests without one are rejected with HTTP 400. See [A2A Overview — Trace ID enforcement](../a2a#trace-id-enforcement-optional-per-agent).
+
 ## Further Reading
 - [AWS Bedrock AgentCore Documentation](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agentcore_InvokeAgentRuntime.html)
 - [LiteLLM Authentication to Bedrock](https://docs.litellm.ai/docs/providers/bedrock#boto3---authentication)
+- [LiteLLM A2A Gateway Overview](../a2a)
+- [A2A Agent Authentication Headers](../a2a_agent_headers)
+- [A2A Agent Permission Management](../a2a_agent_permissions)
+- [MCP AWS SigV4](../mcp_aws_sigv4) — for the AgentCore-hosted MCP servers path (separate from the agent runtimes path)
diff --git a/sidebars.js b/sidebars.js
index 6ff0c72f..813b58bd 100644
--- a/sidebars.js
+++ b/sidebars.js
@@ -310,6 +310,7 @@ const sidebars = {
       type: "category",
       label: "Agent & MCP Gateway",
       items: [
+        "auth_overview",
         {
           type: "category",
           label: "A2A Agent Gateway",