
gproxy

gproxy is a Rust-based multi-channel LLM proxy that exposes OpenAI / Claude / Gemini-style APIs through a unified gateway, with a built-in admin console, user/key management, and request/usage auditing.

Chinese version: README.zh.md

For the full documentation, see the project docs site: https://gproxy.leenhawk.com

Key Features

  • Unified multi-channel gateway: route requests to different upstreams by channel (builtin + custom).
  • Multi-protocol compatibility: one upstream can accept OpenAI/Claude/Gemini requests (controlled by dispatch rules).
  • Credential pool and health states: supports healthy / partial / dead with model-level cooldown retry.
  • OAuth and API Key support: OAuth channels (Codex, ClaudeCode, GeminiCli, Antigravity) and API Key channels.
  • Built-in Web console: available at /, supports English and Chinese.
  • Observability: records upstream/downstream requests and usage metrics (filterable by user/model/time).
  • Async batched storage writes: queue + aggregation to reduce database pressure under load.

Built-in Channels

| Channel ID | Default Upstream | Auth Type |
| --- | --- | --- |
| openai | https://api.openai.com | API Key |
| anthropic | https://api.anthropic.com | API Key |
| aistudio | https://generativelanguage.googleapis.com | API Key |
| vertexexpress | https://aiplatform.googleapis.com | API Key |
| vertex | https://aiplatform.googleapis.com | GCP service account (builtin object) |
| geminicli | https://cloudcode-pa.googleapis.com | OAuth (builtin object) |
| claudecode | https://api.anthropic.com | OAuth/Cookie (builtin object) |
| codex | https://chatgpt.com/backend-api/codex | OAuth (builtin object) |
| antigravity | https://daily-cloudcode-pa.sandbox.googleapis.com | OAuth (builtin object) |
| nvidia | https://integrate.api.nvidia.com | API Key |
| deepseek | https://api.deepseek.com | API Key |
| custom (for example mycustom) | your configured base_url | API Key (secret) |

Quick Start

1. Prerequisites

  • Rust (must support edition = 2024)
  • SQLite (default DSN uses sqlite)
  • Optional: Node.js + pnpm (if you want to rebuild the admin frontend)

2. Prepare Config

cp gproxy.example.toml gproxy.toml

At minimum, set:

  • global.admin_key
  • at least one enabled channel credential (credentials.secret or builtin credential object)
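Put together, a minimal gproxy.toml sketch might look like the following (field names are taken from this README; the admin key and upstream secret are placeholders):

```toml
[global]
admin_key = "replace-with-a-strong-key"

[[channels]]
id = "openai"
enabled = true

[channels.settings]
base_url = "https://api.openai.com"

# One credential entry; API-key channels take a `secret`.
[[channels.credentials]]
secret = "sk-your-upstream-key"
```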

Bootstrap login defaults:

  • username: admin
  • password: value of global.admin_key

3. Run

cargo run -p gproxy

On startup, gproxy prints:

  • listening address (default http://127.0.0.1:8787)
  • the current admin key (printed after password:)

If ./gproxy.toml does not exist, gproxy starts with in-memory defaults and auto-generates a 16-digit admin key (printed to stdout).

4. Minimal Verification

curl -sS http://127.0.0.1:8787/openai/v1/models \
  -H "x-api-key: <your user key or admin key>"

Get a user/admin API key via password login:

curl -sS http://127.0.0.1:8787/login \
  -H "content-type: application/json" \
  -d '{
    "name": "admin",
    "password": "<your admin_key>"
  }'
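Since /login returns the key in an api_key field, the two steps can be chained. A sketch (the sed extraction assumes a flat JSON response body; jq -r '.api_key' works equally well if jq is installed):

```shell
# Log in, extract api_key from the JSON response, then call a provider route.
RESP=$(curl -sS http://127.0.0.1:8787/login \
  -H "content-type: application/json" \
  -d '{"name":"admin","password":"<your admin_key>"}')
# Pull the value of "api_key" out of the response.
API_KEY=$(printf '%s' "$RESP" | sed -n 's/.*"api_key" *: *"\([^"]*\)".*/\1/p')
curl -sS http://127.0.0.1:8787/openai/v1/models -H "x-api-key: $API_KEY"
```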

Deployment

Local deployment

Binary

  1. Download the binary from Releases.
  2. Prepare config:
cp gproxy.example.toml gproxy.toml
  3. Run the binary:
./gproxy

Docker

Pull prebuilt image (recommended):

docker pull ghcr.io/leenhawk/gproxy:latest

Build from local source (only if you need local code changes):

docker build -t gproxy:local .

Run:

docker run --rm -p 8787:8787 \
  -e GPROXY_HOST=0.0.0.0 \
  -e GPROXY_PORT=8787 \
  -e GPROXY_ADMIN_KEY=your-admin-key \
  -e DATABASE_SECRET_KEY='replace-with-long-random-string' \
  -e GPROXY_DSN='sqlite:///app/data/gproxy.db?mode=rwc' \
  -v $(pwd)/data:/app/data \
  ghcr.io/leenhawk/gproxy:latest

Set DATABASE_SECRET_KEY via env vars or your platform secret manager rather than committing it to the repo. Especially on free-tier or shared managed databases, configure it before the first bootstrap so sensitive fields are not stored in plaintext, and keep the same key on every instance using that database.

Cloud deployment

ClawCloud Run

Run on ClawCloud

  • Template file: claw.yaml
  • Use claw.yaml as a custom template in ClawCloud Run App Store -> My Apps -> Debugging.
  • Key inputs: admin_key (generated by default), proxy_url, rust_log, volume_size
  • Recommended persistence: mount /app/data as a persistent volume.

Release downloads and self-update (Cloudflare Pages)

  • Release CI publishes signed binaries and update manifests to a dedicated Cloudflare Pages downloads project.
  • Default public base URL: https://download-gproxy.leenhawk.com
  • Generated manifests:
    • /manifest.json — full download index used by the docs downloads page
    • /releases/manifest.json — stable self-update feed
    • /staging/manifest.json — staging self-update feed
  • The admin UI Cloudflare update source and /admin/system/self_update read from this downloads site.
  • Required GitHub Actions secrets for the downloads deployment:
    • CLOUDFLARE_API_TOKEN
    • CLOUDFLARE_ACCOUNT_ID
    • CLOUDFLARE_DOWNLOADS_PROJECT_NAME
  • Optional secrets:
    • DOWNLOAD_PUBLIC_BASE_URL — custom public domain or Pages URL exposed in docs/manifests
    • UPDATE_SIGNING_KEY_ID — manifest key id override (default gproxy-release-v1)
    • UPDATE_SIGNING_PRIVATE_KEY_B64 and UPDATE_SIGNING_PUBLIC_KEY_B64 — checksum signature generation and verification

Admin Frontend

  • Console entry: GET /
  • Static assets: /assets/*
  • Frontend build output: apps/gproxy/frontend/dist
  • Backend embeds dist into the binary via rust-embed

If you changed frontend code, rebuild first:

cd apps/gproxy/frontend
pnpm install
pnpm build
cd ../../..
cargo run -p gproxy

Configuration (gproxy.toml)

Reference files:

  • gproxy.example.toml (minimal)
  • gproxy.example.full.toml (full)

global

| Field | Description |
| --- | --- |
| host | Bind host, default 127.0.0.1 |
| port | Bind port, default 8787 |
| proxy | Upstream proxy (empty string means disabled) |
| hf_token | HuggingFace token (optional, for tokenizer download) |
| hf_url | HuggingFace base URL, default https://huggingface.co |
| admin_key | Admin bootstrap credential; used as admin password and admin API key on bootstrap; auto-generated if empty |
| mask_sensitive_info | Redact sensitive request/response payloads in logs/events |
| data_dir | Data directory, default ./data |
| dsn | Database DSN; if omitted and data_dir is changed, a sqlite DSN is derived automatically |
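As a TOML fragment, a [global] block mirroring the defaults above might look like this (values illustrative; the dsn shown is the documented sqlite default):

```toml
[global]
host = "127.0.0.1"
port = 8787
proxy = ""                 # empty string disables the upstream proxy
admin_key = "replace-me"   # auto-generated if left empty
mask_sensitive_info = true
data_dir = "./data"
dsn = "sqlite://./data/gproxy.db?mode=rwc"
```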

runtime

| Field | Default | Description |
| --- | --- | --- |
| storage_write_queue_capacity | 4096 | Storage write queue size |
| storage_write_max_batch_size | 1024 | Max events per aggregated storage batch |
| storage_write_aggregate_window_ms | 25 | Aggregation window (ms) |
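Spelled out as a TOML fragment, the defaults above correspond to:

```toml
[runtime]
storage_write_queue_capacity = 4096       # queued events before backpressure
storage_write_max_batch_size = 1024       # events per aggregated write
storage_write_aggregate_window_ms = 25    # flush window in milliseconds
```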

channels

Each channel is declared with [[channels]]:

  • id: channel id (for example openai, claude, mycustom)
  • enabled: runtime enable switch (false disables routing to this channel)
  • settings: channel settings (must include base_url)
  • dispatch: optional; defaults to the channel-specific dispatch table when omitted
  • credentials: credential list (supports multi-credential retry/fallback)

Anthropic/ClaudeCode Cache Rewrite (cache_breakpoints)

For anthropic and claudecode, configure cache-control rewrite with:

  • setting key: channels.settings.cache_breakpoints
  • max 4 rules
  • targets: top_level (global alias), tools, system, messages
  • messages indexing uses flattened messages[*].content blocks after normalizing Claude shorthands (content: "..." becomes one text block)
  • for messages, you may also set content_position / content_index; when either field is present, position / index first select a message, then content_* selects a block inside that message
  • ttl: auto / 5m / 1h (auto means no ttl field is injected)
  • existing request-side cache_control is always preserved and counts toward the 4-rule limit

No-ttl default note:

  • anthropic: upstream default is 5m
  • claudecode: upstream default is 5m
  • use explicit ttl when you need deterministic behavior

Example:

[[channels]]
id = "anthropic"
enabled = true

[channels.settings]
base_url = "https://api.anthropic.com"
cache_breakpoints = [
  { target = "top_level", ttl = "auto" },
  { target = "messages", position = "last_nth", index = 1, ttl = "5m" },
  { target = "messages", position = "last_nth", index = 1, content_position = "last_nth", content_index = 1, ttl = "5m" }
]

[[channels]]
id = "claudecode"
enabled = true

[channels.settings]
base_url = "https://api.anthropic.com"
cache_breakpoints = [
  { target = "top_level", ttl = "auto" },
  { target = "messages", position = "last_nth", index = 1, content_position = "last_nth", content_index = 1, ttl = "1h" }
]

channels.credentials

Each credential can include:

  • id / label: optional identifiers
  • secret: for API key channels
  • builtin: structured credential object for OAuth/service-account channels
  • state: optional health-state seed

state.health.kind supports:

  • healthy
  • partial (with model cooldown list)
  • dead
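A sketch of a multi-credential channel with a seeded health state (the credential fields come from the list above; treat the exact shape of the state object as illustrative rather than authoritative):

```toml
[[channels]]
id = "openai"
enabled = true

[channels.settings]
base_url = "https://api.openai.com"

[[channels.credentials]]
id = "primary"
label = "main key"
secret = "sk-primary"

[[channels.credentials]]
id = "backup"
secret = "sk-backup"
# Optional health-state seed; shape is an assumption based on the fields above.
state = { health = { kind = "healthy" } }
```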

Credential Selection and Cache Affinity

Provider credential routing is controlled by three settings in channels.settings:

  • credential_round_robin_enabled (default true)
  • credential_cache_affinity_enabled (default true, only effective when round-robin is enabled)
  • credential_cache_affinity_max_keys (default 4096, max retained affinity keys per channel)

Effective behavior:

  • credential_round_robin_enabled = false -> StickyNoCache
    • no round-robin
    • no cache affinity pool
    • picks the smallest available credential id and keeps using it until it becomes unavailable or enters cooldown
  • credential_round_robin_enabled = true and credential_cache_affinity_enabled = true -> RoundRobinWithCache
    • round-robin/random among eligible credentials
    • enables internal cache affinity pool for cache-key sticky routing
  • credential_round_robin_enabled = true and credential_cache_affinity_enabled = false -> RoundRobinNoCache
    • round-robin/random among eligible credentials
    • no cache affinity pool

Example:

[[channels]]
id = "openai"
enabled = true

[channels.settings]
base_url = "https://api.openai.com"
credential_round_robin_enabled = true
credential_cache_affinity_enabled = true
credential_cache_affinity_max_keys = 4096

Legacy compatibility:

  • credential_pick_mode is still accepted for backward compatibility.

Detailed design and cache-hit strategy (OpenAI/Claude/Gemini):
https://gproxy.leenhawk.com/guides/credential-selection-cache-affinity/

CLI and Environment Overrides

Priority: CLI flags / env vars > gproxy.toml > defaults

Supported overrides:

  • --config / GPROXY_CONFIG_PATH
  • --host / GPROXY_HOST
  • --port / GPROXY_PORT
  • --proxy / GPROXY_PROXY
  • --admin-key / GPROXY_ADMIN_KEY
  • --bootstrap-force-config / GPROXY_BOOTSTRAP_FORCE_CONFIG
  • --mask-sensitive-info / GPROXY_MASK_SENSITIVE_INFO
  • --data-dir / GPROXY_DATA_DIR
  • --dsn / GPROXY_DSN
  • --storage-write-queue-capacity / GPROXY_STORAGE_WRITE_QUEUE_CAPACITY
  • --storage-write-max-batch-size / GPROXY_STORAGE_WRITE_MAX_BATCH_SIZE
  • --storage-write-aggregate-window-ms / GPROXY_STORAGE_WRITE_AGGREGATE_WINDOW_MS
  • --database-secret-key / DATABASE_SECRET_KEY

Configure the database-at-rest encryption key with --database-secret-key or the DATABASE_SECRET_KEY environment variable.

When DATABASE_SECRET_KEY is unset, gproxy stores and reads DB values as plaintext. When it is set, gproxy transparently encrypts at rest for credential.secret_json, user API keys, user passwords, admin_key, and hf_token.

Recommendations:

  • set the key before the first database bootstrap and keep it identical on every instance using the same database;
  • on free-tier or shared managed databases, strongly prefer setting the key so sensitive values are not stored in plaintext;
  • inject it via env vars / platform secrets instead of committing it to the repo;
  • do not change it casually after encrypted data exists, or older ciphertext will become unreadable without migration / re-encryption.
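One way to generate a suitably long key before the first bootstrap (openssl is an assumption; any strong random source works):

```shell
# 32 random bytes rendered as 64 hex characters; export for the gproxy process.
DATABASE_SECRET_KEY="$(openssl rand -hex 32)"
export DATABASE_SECRET_KEY
echo "generated key of length ${#DATABASE_SECRET_KEY}"
```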

Bootstrap Source Mode

--bootstrap-force-config / GPROXY_BOOTSTRAP_FORCE_CONFIG is a startup-only switch (CLI/env only, not a gproxy.toml field).

  • default (false or unset):
    • if DB is not initialized, bootstrap from gproxy.toml as usual.
    • if DB is already initialized, prefer DB state and skip config-file channel/provider import.
    • startup admin_key override is still honored.
  • true:
    • force-apply config-file channels/settings/credentials/global values on boot.
    • useful when you intentionally want the config file to overwrite existing DB bootstrap state.

API Overview

All errors return:

{ "error": "..." }

Auth Headers

  • POST /login uses JSON body { "name": "...", "password": "..." } and returns api_key
  • Admin/User APIs (except /login): use x-api-key
  • Provider APIs also accept:
    • x-api-key
    • x-goog-api-key
    • Authorization: Bearer ...
    • Gemini query key ?key=... (normalized into x-api-key)

Provider Routes

1) Scoped (recommended)

Provider is explicit in path, examples:

  • POST /openai/v1/chat/completions
  • POST /anthropic/v1/messages
  • POST /aistudio/v1beta/models/{model}:generateContent

2) Unscoped (single unified entry)

Provider is resolved from model prefix:

  • POST /v1/chat/completions
  • POST /v1/responses
  • POST /v1/messages
  • GET /v1/models
  • GET /v1/models/{provider}/{model}

Constraints:

  • For OpenAI/Claude-style request bodies, model must be <provider>/<model>, for example openai/gpt-4.1.
  • For Gemini target paths, provider must be included, for example models/aistudio/gemini-2.5-flash:generateContent.

OAuth and Upstream Usage

  • GET /{provider}/v1/oauth
  • GET /{provider}/v1/oauth/callback
  • GET /{provider}/v1/usage

OAuth-capable channels: codex, claudecode, geminicli, antigravity

Admin APIs (/admin/*)

Main groups:

  • Global settings: /admin/global-settings, /admin/global-settings/upsert
  • Config export/import: /admin/config/export-toml, /admin/config/import-toml
  • Self update: /admin/system/self_update
  • Providers/Credentials/CredentialStatuses: query/upsert/delete
  • Users: query/upsert/delete (/admin/users/upsert requires password)
  • UserKeys: query/generate/delete
  • Requests: /admin/requests/upstream/query, /admin/requests/downstream/query
  • Usage: /admin/usages/query, /admin/usages/summary

User APIs (/user/*)

  • POST /user/keys/query
  • POST /user/keys/generate
  • POST /user/keys/delete
  • POST /user/usages/query
  • POST /user/usages/summary

Request Examples

Scoped OpenAI Chat

curl -sS http://127.0.0.1:8787/openai/v1/chat/completions \
  -H "x-api-key: <key>" \
  -H "content-type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "messages": [{"role":"user","content":"hello"}],
    "stream": false
  }'

Unscoped OpenAI Chat (model-prefixed routing)

curl -sS http://127.0.0.1:8787/v1/chat/completions \
  -H "x-api-key: <key>" \
  -H "content-type: application/json" \
  -d '{
    "model": "openai/gpt-4.1",
    "messages": [{"role":"user","content":"hello"}],
    "stream": false
  }'

Scoped Gemini GenerateContent

curl -sS "http://127.0.0.1:8787/aistudio/v1beta/models/gemini-2.5-flash:generateContent" \
  -H "x-api-key: <key>" \
  -H "content-type: application/json" \
  -d '{
    "contents":[{"role":"user","parts":[{"text":"hello"}]}]
  }'

Anthropic/ClaudeCode Prompt Cache Quick Check (4 curls)

Make sure both providers have at least one cache_breakpoints rule (for example { target = "top_level", ttl = "auto" }).

BASE="http://127.0.0.1:8787"
KEY="<your x-api-key>"
SYS="$(for i in $(seq 1 1800); do printf 'cache-prefix-%04d ' "$i"; done)"
# 1) Claude first request (cache write)
curl -sS "$BASE/anthropic/v1/messages" \
  -H "x-api-key: $KEY" \
  -H "content-type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  --data-binary @- <<JSON | jq '.usage'
{
  "model": "claude-neptune-v3",
  "max_tokens": 128,
  "stream": false,
  "system": "$SYS",
  "messages": [
    { "role": "user", "content": "only reply: cache-ok" }
  ]
}
JSON
# 2) Claude second request (cache read)
curl -sS "$BASE/anthropic/v1/messages" \
  -H "x-api-key: $KEY" \
  -H "content-type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  --data-binary @- <<JSON | jq '.usage'
{
  "model": "claude-neptune-v3",
  "max_tokens": 128,
  "stream": false,
  "system": "$SYS",
  "messages": [
    { "role": "user", "content": "only reply: cache-ok" }
  ]
}
JSON
# 3) ClaudeCode first request (cache write)
curl -sS "$BASE/claudecode/v1/messages" \
  -H "x-api-key: $KEY" \
  -H "content-type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  --data-binary @- <<JSON | jq '.usage'
{
  "model": "claude-sonnet-4-6",
  "max_tokens": 128,
  "stream": false,
  "system": "$SYS",
  "messages": [
    { "role": "user", "content": "only reply: cache-ok" }
  ]
}
JSON
# 4) ClaudeCode second request (cache read)
curl -sS "$BASE/claudecode/v1/messages" \
  -H "x-api-key: $KEY" \
  -H "content-type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  --data-binary @- <<JSON | jq '.usage'
{
  "model": "claude-sonnet-4-6",
  "max_tokens": 128,
  "stream": false,
  "system": "$SYS",
  "messages": [
    { "role": "user", "content": "only reply: cache-ok" }
  ]
}
JSON

Architecture

Workspace Layout

| Path | Responsibility |
| --- | --- |
| apps/gproxy | Executable service entry (Axum + embedded admin frontend) |
| crates/gproxy-core | AppState, router orchestration, auth, request execution |
| crates/gproxy-provider | Channel implementations, retry, OAuth, dispatch, tokenizers |
| crates/gproxy-middleware | Protocol transform middleware, usage extraction |
| crates/gproxy-protocol | OpenAI/Claude/Gemini typed protocol models and transforms |
| crates/gproxy-storage | SeaORM storage layer, query models, async write queue |
| crates/gproxy-admin | Admin/user domain operations |

Runtime Flow

  • On bootstrap:
    • load config and apply CLI/env overrides
    • choose bootstrap source mode (bootstrap_force_config): prefer DB state by default once initialized
    • connect database and sync schema automatically
    • initialize provider registry, credentials, and credential states
    • ensure admin principal (id=0) and admin key exist
  • On request:
    • authenticate user key
    • route + transform/forward according to dispatch table
    • pick eligible credentials and retry/fallback
    • persist upstream/downstream events and usage records

Credential States and Cooldown

  • healthy: available
  • partial: model-level cooldown
  • dead: unavailable

Default cooldowns:

  • rate limit: 60s
  • transient failure: 15s

Testing

Provider smoke/regression scripts:

  • tests/provider/curl_provider.sh
  • tests/provider/run_channel_regression.sh

Examples:

API_KEY='<key>' tests/provider/curl_provider.sh \
  --provider openai \
  --method openai_chat \
  --model gpt-4.1
API_KEY='<key>' tests/provider/run_channel_regression.sh \
  --provider openai \
  --model gpt-5-nano \
  --embedding-model text-embedding-3-small

Common Issues

1) 401 unauthorized

  • Ensure x-api-key is provided for key-protected routes, and both the key and its owner user are enabled.
  • If you don't have a key yet, call POST /login with username/password first.

2) 403 forbidden on admin routes

  • The key is not owned by the admin user (id=0).

3) 503 all eligible credentials exhausted

  • Check:
    • whether the channel has any available credential
    • whether credential status is dead or currently partial for the target model
    • whether upstream keeps returning 429/5xx

4) model must be prefixed as <provider>/...

  • You called an unscoped route without a provider-prefixed model (for example openai/gpt-4.1).

5) Realtime WebSocket unavailable

  • /v1/realtime is currently not implemented; use /v1/responses (HTTP) instead.

Security Notes

  • Set a strong admin_key in production.
  • Keep mask_sensitive_info = true unless you explicitly need full payload visibility for debugging.
  • If you configure an outbound proxy, ensure the proxy path is trusted and access-controlled.

Data and Directories

By default:

  • data dir: ./data
  • default DB: sqlite://./data/gproxy.db?mode=rwc
  • tokenizer cache: ./data/tokenizers

gproxy-storage supports sqlite / mysql / postgres via dsn.
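Illustrative DSN shapes (the sqlite form matches the documented default; the mysql/postgres URL forms follow standard database-URL conventions and are assumptions, not taken from gproxy docs):

```toml
dsn = "sqlite://./data/gproxy.db?mode=rwc"
# dsn = "postgres://user:pass@localhost:5432/gproxy"
# dsn = "mysql://user:pass@localhost:3306/gproxy"
```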

Development Commands

# backend format/lint/check
cargo fmt
cargo check
cargo clippy --workspace --all-targets

# tests
cargo test --workspace

# run service
cargo run -p gproxy

Frontend:

cd apps/gproxy/frontend
pnpm install
pnpm typecheck
pnpm build

License

This project is licensed under AGPL-3.0-or-later (see LICENSE).
