Welcome to the official OneLLM.dev API documentation! Here you will find everything you need to authenticate, send requests, and handle responses.
We recommend reading the documentation thoroughly before building your product. Best of luck!
All API requests require an API key. Get your key from the OneLLM dashboard after logging in.
Include the key in the Authorization header as a Bearer token:
Authorization: Bearer $APIKEY

An API key is a unique identifier used to authenticate and authorize a user or application when accessing an API (Application Programming Interface). It acts like a password, allowing the API to verify that the request is coming from a legitimate source.
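For example, in Python the header could be assembled as in the sketch below; the ONELLM_API_KEY environment variable name is only an illustration, so store your key however you prefer.

```python
import os

# Illustrative only: read the key from an environment variable named ONELLM_API_KEY.
headers = {
    "Authorization": f"Bearer {os.environ['ONELLM_API_KEY']}",
    "Content-Type": "application/json",
}
```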
The BASE_URL for the API is https://onellm.dev/api.
This is the primary endpoint for interacting with the language models. It allows you to send a chat conversation and receive a response from the specified model.
POST https://onellm.dev/api
Note: Streaming responses are not supported. Do not include a stream parameter.
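As a minimal sketch, a non-streaming request could be sent from Python with the third-party requests package (the prompt and model choice below are placeholders):

```python
import os
import requests

BASE_URL = "https://onellm.dev/api"
headers = {"Authorization": f"Bearer {os.environ['ONELLM_API_KEY']}"}

payload = {
    "model": "GPT-4.1",
    "messages": [{"role": "user", "content": "Tell me a joke about computers."}],
    "max_tokens": 50,
}

# Note: no "stream" field -- the full response arrives as a single JSON object.
resp = requests.post(BASE_URL, headers=headers, json=payload, timeout=60)
print(resp.status_code)
print(resp.json())
```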
Each request body must be a JSON object with the following fields:
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | String | Yes | The model ID to use for the request. See Supported Models for model information. |
| messages | Array | Yes | Conversation history as an array of message objects. Each object requires role and content. See the example below. |
| temperature | Number | No | Controls randomness. Ranges from 0 to 2, with 0 being the most deterministic and 2 the most random. Default is 1. |
| max_tokens | Integer | No | Maximum number of tokens to generate. The server adjusts this automatically if it exceeds your balance. |
| top_p | Number | No | Nucleus sampling probability (0–1). Controls diversity. |
| stop_sequences | Array | No | Strings that will cause the model to stop generating tokens. |
For more parameters, see the OpenAPI specification.
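As a sketch of how these fields fit together, the payload below carries a multi-turn conversation in messages along with the optional sampling parameters; the specific values are illustrative, not recommendations.

```python
payload = {
    # Required fields
    "model": "GPT-4.1",
    "messages": [
        {"role": "user", "content": "Tell me a joke about computers."},
        {"role": "assistant", "content": "Why did the computer show up at work late? It had a hard drive!"},
        {"role": "user", "content": "Explain the pun."},
    ],
    # Optional fields
    "temperature": 0.7,         # 0-2, lower is more deterministic
    "max_tokens": 100,
    "top_p": 0.9,               # nucleus sampling, 0-1
    "stop_sequences": ["\n\n"],
}
```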
Example request:
{
"model": "GPT-4.1",
"messages": [
{ "role": "user", "content": "Tell me a joke about computers." }
],
"max_tokens": 50
}

Example response:
{
"provider": "openai",
"model": "GPT-4.1",
"role": "assistant",
"content": "Why did the computer show up at work late? It had a hard drive!",
"usage": {
"input_tokens": 7,
"output_tokens": 16,
"total_tokens": 23
},
"finish_reason": "stop"
}
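Assuming resp is the requests response object from the earlier sketch, the fields above could be read like this:

```python
data = resp.json()  # parsed response body (see the example response above)

print(data["content"])        # the assistant's reply text
print(data["model"])          # model that produced the reply
print(data["finish_reason"])  # e.g. "stop"

usage = data["usage"]
print(usage["input_tokens"], usage["output_tokens"], usage["total_tokens"])
```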
Available model IDs (use in the model field):

- GPT Series: GPT-5, GPT-5-Mini, GPT-5-Nano, GPT-5-Chat-Latest, GPT-4.1, GPT-4.1-Mini, GPT-4.1-Nano, GPT-o3, GPT-o3-pro, GPT-o3-DeepResearch, GPT-o3-Mini, GPT-o4-mini, GPT-4o, GPT-4o-mini, GPT-o1, GPT-o1-Mini
- Anthropic (Claude-like): Opus-4, Sonnet-4, Haiku-3.5, Opus-3, Sonnet-3.7, Haiku-3
- DeepSeek: DeepSeek-Reasoner, DeepSeek-Chat
- Flash/Preview: 2.5-Flash-preview, 2.5-Pro-preview, 2.0-Flash, 2.0-Flash-lite, 1.5-Flash, 1.5-Flash-8B, 1.5-Pro
- Mistral & Variants: Mistral-Medium-3, Magistral-Medium, Codestral, Devstral-Medium, Mistral-Saba, Mistral-Large, Pixtral-Large, Ministral-8B-24.10, Ministral-3B-24.10, Mistral-Small-3.2, Magistral-Small, Devstral-Small, Pixtral-12B, Mistral-NeMo, Mistral-7B, Mixtral-8x7B, Mixtral-8x22B
The API may return different error codes. Common ones are:
| Status Code | Meaning | Reason |
|---|---|---|
| 400 | Bad Request | Invalid JSON or missing required fields. |
| 401 | Unauthorized | API key missing or invalid. |
| 403 | Forbidden | You don't have access to the requested resource. |
| 404 | Not Found | Invalid endpoint. |
| 405 | Method Not Allowed | The endpoint does not support the HTTP method used. |
| 422 | Unprocessable Entity | Invalid type or format in the request body. |
| 429 | Too Many Requests | Rate limit exceeded. Try again later. |
| 500 | Internal Server Error | Something went wrong on the server side. |
Be sure to check the response body for additional error detail. If you encounter an error you cannot debug, please join our Discord support server.
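As a rough sketch (reusing BASE_URL, headers, and payload from the request example above), a client might handle these codes as follows; the exact shape of error bodies is not specified here, so only the raw body text is printed:

```python
import time

resp = requests.post(BASE_URL, headers=headers, json=payload, timeout=60)

if resp.status_code == 429:
    # Rate limited: back off briefly and retry once.
    time.sleep(5)
    resp = requests.post(BASE_URL, headers=headers, json=payload, timeout=60)

if resp.ok:
    print(resp.json()["content"])
else:
    # The response body carries extra detail about what went wrong.
    print(f"Request failed ({resp.status_code}): {resp.text}")
```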
- Streaming is not supported; the model currently returns the complete response in a single reply. Streaming support is in development and should be available soon.
- max_tokens will auto-adjust if it is larger than your allowed balance.
- A minimum balance of $0.10 is required to make API requests.
- The response includes usage info:
  - input_tokens: number of tokens sent in the request.
  - output_tokens: number of tokens generated by the model.
  - total_tokens: combined count of both.