1 change: 1 addition & 0 deletions docs.json
@@ -408,6 +408,7 @@
"integrations/llms/bedrock/bedrock-knowledgebase"
]
},
"integrations/llms/bedrock-mantle",
"integrations/llms/aws-sagemaker",
"integrations/llms/ollama",
{
4 changes: 4 additions & 0 deletions integrations/llms.mdx
@@ -21,6 +21,10 @@ description: "Portkey connects with all major LLM providers and orchestration fr
<Frame><img src="/images/supported-llm/amazon.avif" alt="AWS Bedrock" /></Frame>
</Card>

<Card title="Bedrock Mantle" href="/integrations/llms/bedrock-mantle">
<Frame><img src="/images/supported-llm/amazon.avif" alt="Bedrock Mantle" /></Frame>
</Card>

<Card title="Cerebras" href="/integrations/llms/cerebras">
<Frame><img src="/images/llms/cerebras.avif" alt="Cerebras" /></Frame>
</Card>
357 changes: 357 additions & 0 deletions integrations/llms/bedrock-mantle.mdx
@@ -0,0 +1,357 @@
---
title: "Amazon Bedrock Mantle"
description: "Use Amazon Bedrock Mantle's OpenAI-compatible endpoints through Portkey for chat completions, messages, and responses APIs."
---

Amazon Bedrock Mantle provides OpenAI-compatible API endpoints for model inference on AWS. Access models from Anthropic, Mistral, NVIDIA, Qwen, DeepSeek, Google, and more through familiar OpenAI SDK patterns.

<Card title="AWS Bedrock Mantle Documentation" href="https://docs.aws.amazon.com/bedrock/latest/userguide/bedrock-mantle.html" />

## Quick Start

<CodeGroup>
```python Python
from portkey_ai import Portkey

portkey = Portkey(
    api_key="PORTKEY_API_KEY",
    provider="@your-bedrock-mantle-provider"
)

response = portkey.chat.completions.create(
    model="mistral.ministral-3-3b-instruct",
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=50
)

print(response.choices[0].message.content)
```

```js JavaScript
import Portkey from 'portkey-ai'

const portkey = new Portkey({
  apiKey: "PORTKEY_API_KEY",
  provider: "@your-bedrock-mantle-provider"
})

const response = await portkey.chat.completions.create({
  model: "mistral.ministral-3-3b-instruct",
  messages: [{ role: "user", content: "Hello!" }],
  max_tokens: 50
})

console.log(response.choices[0].message.content)
```

```sh cURL
curl -X POST "https://api.portkey.ai/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "x-portkey-api-key: PORTKEY_API_KEY" \
  -H "x-portkey-provider: bedrock-mantle" \
  -H "x-portkey-virtual-key: your-bedrock-mantle-virtual-key" \
  -d '{
    "model": "mistral.ministral-3-3b-instruct",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 50
  }'
```
</CodeGroup>

## Add Provider in Model Catalog

1. Go to [Model Catalog](https://app.portkey.ai/model-catalog) in Portkey
2. Search for **Bedrock Mantle** and select it
3. Enter your AWS credentials

<CardGroup cols={2}>
<Card title="AWS Access Key" href="/integrations/llms/aws-bedrock#how-to-find-your-aws-credentials">
Use `AWS Access Key ID`, `AWS Secret Access Key`, and `AWS Region`.

[**Credential Guide**](/integrations/llms/aws-bedrock#how-to-find-your-aws-credentials)
</Card>
<Card title="AWS Assumed Role" href="/product/model-catalog/connect-bedrock-with-amazon-assumed-role">
Use `AWS Role ARN`, optional `External ID`, and `AWS Region`.

[**Setup Guide**](/product/model-catalog/connect-bedrock-with-amazon-assumed-role)
</Card>
</CardGroup>

---

## Supported Endpoints

Bedrock Mantle supports four API endpoints. Each model works on specific endpoints based on its provider.

### Chat Completions — `/v1/chat/completions`

Works with non-Anthropic models (Mistral, NVIDIA, Qwen, Google, DeepSeek, MiniMax, Moonshot, Z AI, Writer, OpenAI).

<CodeGroup>
```python Python
response = portkey.chat.completions.create(
    model="mistral.mistral-large-3-675b-instruct",
    messages=[{"role": "user", "content": "Explain quantum computing in one sentence."}],
    max_tokens=100
)
```

```js JavaScript
const response = await portkey.chat.completions.create({
  model: "mistral.mistral-large-3-675b-instruct",
  messages: [{ role: "user", content: "Explain quantum computing in one sentence." }],
  max_tokens: 100
})
```

```sh cURL
curl -X POST "https://api.portkey.ai/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "x-portkey-api-key: PORTKEY_API_KEY" \
  -H "x-portkey-provider: bedrock-mantle" \
  -H "x-portkey-virtual-key: your-virtual-key" \
  -d '{
    "model": "mistral.mistral-large-3-675b-instruct",
    "messages": [{"role": "user", "content": "Explain quantum computing in one sentence."}],
    "max_tokens": 100
  }'
```
</CodeGroup>

### Messages — `/v1/messages`

Works with Anthropic models. Uses the Anthropic Messages API format.

<CodeGroup>
```python Python
import requests

response = requests.post(
    "https://api.portkey.ai/v1/messages",
    headers={
        "Content-Type": "application/json",
        "x-portkey-api-key": "PORTKEY_API_KEY",
        "x-portkey-provider": "bedrock-mantle",
        "x-portkey-virtual-key": "your-virtual-key"
    },
    json={
        "model": "anthropic.claude-opus-4-7",
        "max_tokens": 100,
        "messages": [{"role": "user", "content": "Hello, Claude!"}]
    }
)
```

```sh cURL
curl -X POST "https://api.portkey.ai/v1/messages" \
  -H "Content-Type: application/json" \
  -H "x-portkey-api-key: PORTKEY_API_KEY" \
  -H "x-portkey-provider: bedrock-mantle" \
  -H "x-portkey-virtual-key: your-virtual-key" \
  -d '{
    "model": "anthropic.claude-opus-4-7",
    "max_tokens": 100,
    "messages": [{"role": "user", "content": "Hello, Claude!"}]
  }'
```
</CodeGroup>

#### Extended Thinking

<Note>Bedrock Mantle uses `thinking.type: "adaptive"` with `output_config.effort` instead of the standard Anthropic `thinking.type: "enabled"` with `budget_tokens`.</Note>

<CodeGroup>
```sh cURL
curl -X POST "https://api.portkey.ai/v1/messages" \
  -H "Content-Type: application/json" \
  -H "x-portkey-api-key: PORTKEY_API_KEY" \
  -H "x-portkey-provider: bedrock-mantle" \
  -H "x-portkey-virtual-key: your-virtual-key" \
  -d '{
    "model": "anthropic.claude-opus-4-7",
    "max_tokens": 16000,
    "thinking": {"type": "adaptive"},
    "output_config": {"effort": "high"},
    "messages": [{"role": "user", "content": "What is 27 * 453? Think step by step."}]
  }'
```
</CodeGroup>
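
The same adaptive-thinking request can be made from Python. A minimal sketch, assuming the `requests` library and the header scheme used in the cURL calls above; `adaptive_thinking_payload` and `ask_with_thinking` are illustrative helper names, not SDK methods:

```python
import requests

def adaptive_thinking_payload(model, prompt, effort="high", max_tokens=16000):
    """Build a Messages payload in Mantle's adaptive-thinking format."""
    return {
        "model": model,
        "max_tokens": max_tokens,
        "thinking": {"type": "adaptive"},     # Mantle-specific, not "enabled"
        "output_config": {"effort": effort},  # replaces Anthropic's budget_tokens
        "messages": [{"role": "user", "content": prompt}],
    }

def ask_with_thinking(headers, **kwargs):
    """POST to Portkey's Messages endpoint and return the parsed JSON body."""
    resp = requests.post(
        "https://api.portkey.ai/v1/messages",
        headers=headers,
        json=adaptive_thinking_payload(**kwargs),
    )
    resp.raise_for_status()
    return resp.json()
```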

### Responses — `/v1/responses`

Works with select models (e.g., `openai.gpt-oss-120b`, `openai.gpt-oss-20b`). Supports create, get, and delete operations.

<CodeGroup>
```sh "Create Response"
curl -X POST "https://api.portkey.ai/v1/responses" \
  -H "Content-Type: application/json" \
  -H "x-portkey-api-key: PORTKEY_API_KEY" \
  -H "x-portkey-provider: bedrock-mantle" \
  -H "x-portkey-virtual-key: your-virtual-key" \
  -d '{
    "model": "openai.gpt-oss-120b",
    "input": "Hello! How can you help me today?"
  }'
```

```sh "Get Response"
curl -X GET "https://api.portkey.ai/v1/responses/resp_YOUR_RESPONSE_ID" \
  -H "x-portkey-api-key: PORTKEY_API_KEY" \
  -H "x-portkey-provider: bedrock-mantle" \
  -H "x-portkey-virtual-key: your-virtual-key"
```

```sh "Delete Response"
curl -X DELETE "https://api.portkey.ai/v1/responses/resp_YOUR_RESPONSE_ID" \
  -H "x-portkey-api-key: PORTKEY_API_KEY" \
  -H "x-portkey-provider: bedrock-mantle" \
  -H "x-portkey-virtual-key: your-virtual-key"
```
</CodeGroup>
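
Without an SDK, the three operations reduce to small HTTP helpers. A sketch assuming the `requests` library; the function names are illustrative, and the header names mirror the cURL examples above:

```python
import requests

BASE_URL = "https://api.portkey.ai/v1/responses"

def mantle_headers(portkey_api_key, virtual_key):
    """Portkey gateway headers, as used in the cURL examples above."""
    return {
        "Content-Type": "application/json",
        "x-portkey-api-key": portkey_api_key,
        "x-portkey-provider": "bedrock-mantle",
        "x-portkey-virtual-key": virtual_key,
    }

def create_response(headers, model, text):
    r = requests.post(BASE_URL, headers=headers, json={"model": model, "input": text})
    r.raise_for_status()
    return r.json()

def get_response(headers, response_id):
    r = requests.get(f"{BASE_URL}/{response_id}", headers=headers)
    r.raise_for_status()
    return r.json()

def delete_response(headers, response_id):
    requests.delete(f"{BASE_URL}/{response_id}", headers=headers).raise_for_status()
```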

### Count Tokens — `/v1/messages/count_tokens`

Count input tokens for Anthropic models without making an inference call.

<CodeGroup>
```sh cURL
curl -X POST "https://api.portkey.ai/v1/messages/count_tokens" \
  -H "Content-Type: application/json" \
  -H "x-portkey-api-key: PORTKEY_API_KEY" \
  -H "x-portkey-provider: bedrock-mantle" \
  -H "x-portkey-virtual-key: your-virtual-key" \
  -d '{
    "model": "anthropic.claude-opus-4-7",
    "system": "You are a scientist",
    "messages": [{"role": "user", "content": "Hello, Claude"}]
  }'
```
</CodeGroup>

Response:
```json
{"input_tokens": 25}
```
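
The same call from Python, sketched with the `requests` library; `build_count_payload` and `count_tokens` are illustrative helper names, not SDK methods:

```python
import requests

def build_count_payload(model, messages, system=None):
    """Payload for /v1/messages/count_tokens; the system prompt is optional."""
    payload = {"model": model, "messages": messages}
    if system is not None:
        payload["system"] = system
    return payload

def count_tokens(headers, model, messages, system=None):
    """Return the input-token count without running inference."""
    r = requests.post(
        "https://api.portkey.ai/v1/messages/count_tokens",
        headers=headers,
        json=build_count_payload(model, messages, system),
    )
    r.raise_for_status()
    return r.json()["input_tokens"]
```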

---

## Endpoint-Model Compatibility

Not all models support all endpoints. Use this table as a quick reference:

| Model Family | Chat Completions | Messages | Responses | Count Tokens |
|-------------|:---:|:---:|:---:|:---:|
| `anthropic.*` (Claude) | — | Yes | — | Yes |
| `openai.gpt-oss-120b`, `openai.gpt-oss-20b` | Yes | — | Yes | — |
| `openai.gpt-oss-safeguard-*` | Yes | — | — | — |
| `mistral.*` | Yes | — | — | — |
| `nvidia.*` | Yes | — | — | — |
| `qwen.*` | Yes | — | — | — |
| `google.gemma-*` | Yes | — | — | — |
| `deepseek.*` | Yes | — | — | — |
| `minimax.*` | Yes | — | — | — |
| `moonshotai.*` | Yes | — | — | — |
| `zai.*` | Yes | — | — | — |
| `writer.*` | Yes | — | — | — |

<Note>Use the `/v1/models` endpoint on Mantle directly to discover the full list of models available in your AWS account and region.</Note>
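
Since sending a model to the wrong endpoint fails outright, a small client-side guard can encode the table above. A sketch (not part of the Portkey SDK); the prefix checks are illustrative and should track your account's actual model list:

```python
# Models that also accept /v1/responses, per the table above.
RESPONSES_CAPABLE = {"openai.gpt-oss-120b", "openai.gpt-oss-20b"}

def endpoint_for(model: str) -> str:
    """Return the primary inference endpoint for a Bedrock Mantle model ID."""
    if model.startswith("anthropic."):
        return "/v1/messages"
    # All other families in the table use Chat Completions.
    return "/v1/chat/completions"

def supports_responses(model: str) -> bool:
    return model in RESPONSES_CAPABLE
```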

---

## Streaming

Enable streaming by setting `stream: true`.

<CodeGroup>
```python "Chat Completions Stream"
response = portkey.chat.completions.create(
    model="qwen.qwen3-32b",
    messages=[{"role": "user", "content": "Tell me a story"}],
    max_tokens=200,
    stream=True
)

for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```

```sh "Messages Stream"
curl -X POST "https://api.portkey.ai/v1/messages" \
  -H "Content-Type: application/json" \
  -H "x-portkey-api-key: PORTKEY_API_KEY" \
  -H "x-portkey-provider: bedrock-mantle" \
  -H "x-portkey-virtual-key: your-virtual-key" \
  -d '{
    "model": "anthropic.claude-opus-4-7",
    "max_tokens": 200,
    "stream": true,
    "messages": [{"role": "user", "content": "Tell me a story"}]
  }'
```
</CodeGroup>
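
Chat Completions streams arrive as server-sent events, one `data:` line per chunk, ending with a `[DONE]` sentinel in the usual OpenAI-compatible framing. When consuming the stream without an SDK, a minimal line parser might look like this (a sketch; Anthropic-style Messages streams use typed SSE events instead):

```python
import json

def parse_sse_line(line: str):
    """Decode one SSE line into a JSON chunk; return None for non-data
    lines and for the terminating [DONE] sentinel."""
    if not line.startswith("data:"):
        return None
    data = line[len("data:"):].strip()
    if data == "[DONE]":
        return None
    return json.loads(data)
```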

---

## Limitations

- **Not all models support all endpoints.** Each model on Mantle only works on specific endpoints based on its provider family. For example, Anthropic models only work on `/v1/messages`, not `/v1/chat/completions`. See the [compatibility table](#endpoint-model-compatibility) above.
- **Model availability is account and region specific.** The models available to you depend on your AWS account's access permissions and the region you're using. Some models (e.g., research previews) require explicit allowlisting by AWS.
- **Extended thinking uses a different format.** Mantle requires `thinking.type: "adaptive"` with `output_config.effort` instead of the standard Anthropic `thinking.type: "enabled"` with `budget_tokens`.
- **`/v1/responses` input items listing is not supported.** The `GET /v1/responses/:id/input_items` endpoint is not available on Mantle.
- **Prompt caching minimum threshold.** Anthropic prompt caching on Mantle requires the cached content to meet a minimum token threshold (typically 2048+ tokens). Smaller prompts won't trigger caching.

---

## Supported Models

Model availability depends on your AWS account and region. Discover available models with:

```sh
curl -X GET "https://bedrock-mantle.us-east-1.api.aws/v1/models" \
  -H "Authorization: Bearer YOUR_API_KEY"
```
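
The same discovery call from Python, sketched with the `requests` library; the code assumes the response follows the OpenAI-style `{"data": [{"id": ...}]}` shape:

```python
import requests

def mantle_models_url(region: str) -> str:
    """Regional Bedrock Mantle model-discovery endpoint."""
    return f"https://bedrock-mantle.{region}.api.aws/v1/models"

def list_models(region: str, api_key: str):
    """Return the model IDs available to this account in the given region."""
    r = requests.get(
        mantle_models_url(region),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    r.raise_for_status()
    return [m["id"] for m in r.json().get("data", [])]
```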

<Card title="Bedrock Mantle Model List" href="https://docs.aws.amazon.com/bedrock/latest/userguide/bedrock-mantle.html" />

---

## Supported Regions

| Region | Endpoint |
|--------|----------|
| US East (N. Virginia) | `bedrock-mantle.us-east-1.api.aws` |
| US East (Ohio) | `bedrock-mantle.us-east-2.api.aws` |
| US West (Oregon) | `bedrock-mantle.us-west-2.api.aws` |
| Asia Pacific (Mumbai) | `bedrock-mantle.ap-south-1.api.aws` |
| Asia Pacific (Tokyo) | `bedrock-mantle.ap-northeast-1.api.aws` |
| Asia Pacific (Jakarta) | `bedrock-mantle.ap-southeast-3.api.aws` |
| Europe (Frankfurt) | `bedrock-mantle.eu-central-1.api.aws` |
| Europe (Ireland) | `bedrock-mantle.eu-west-1.api.aws` |
| Europe (London) | `bedrock-mantle.eu-west-2.api.aws` |
| Europe (Milan) | `bedrock-mantle.eu-south-1.api.aws` |
| Europe (Stockholm) | `bedrock-mantle.eu-north-1.api.aws` |
| South America (São Paulo) | `bedrock-mantle.sa-east-1.api.aws` |

Set the region when creating your provider in [Model Catalog](https://app.portkey.ai/model-catalog).

---

## Next Steps

Explore Portkey features that work with Bedrock Mantle:

<CardGroup cols={2}>
<Card title="SDK Reference" href="/api-reference/sdk">
Python and Node.js SDK documentation.
</Card>
<Card title="Gateway Configs" href="/product/ai-gateway/configs">
Add fallbacks, retries, load balancing, and caching.
</Card>
<Card title="Observability" href="/product/observability">
Track logs, traces, costs, and latency.
</Card>
<Card title="Prompt Management" href="/product/prompt-engineering-studio">
Version and manage prompts across models.
</Card>
</CardGroup>