gproxy is a Rust-based multi-channel LLM proxy that exposes OpenAI / Claude / Gemini-style APIs through a unified gateway, with a built-in admin console, user/key management, and request/usage auditing.
Chinese version: README.zh.md
Full documentation: https://gproxy.leenhawk.com/
- Unified multi-channel gateway: route requests to different upstreams by channel (builtin + custom).
- Multi-protocol compatibility: one upstream can accept OpenAI/Claude/Gemini requests (controlled by dispatch rules).
- Credential pool and health states: supports `healthy` / `partial` / `dead` with model-level cooldown retry.
- OAuth and API Key support: OAuth channels (Codex, ClaudeCode, GeminiCli, Antigravity) and API Key channels.
- Built-in Web console: available at `/`, supports English and Chinese.
- Observability: records upstream/downstream requests and usage metrics (filterable by user/model/time).
- Async batched storage writes: queue + aggregation to reduce database pressure under load.
| Channel ID | Default Upstream | Auth Type |
|---|---|---|
| `openai` | https://api.openai.com | API Key |
| `anthropic` | https://api.anthropic.com | API Key |
| `aistudio` | https://generativelanguage.googleapis.com | API Key |
| `vertexexpress` | https://aiplatform.googleapis.com | API Key |
| `vertex` | https://aiplatform.googleapis.com | GCP service account (builtin object) |
| `geminicli` | https://cloudcode-pa.googleapis.com | OAuth (builtin object) |
| `claudecode` | https://api.anthropic.com | OAuth/Cookie (builtin object) |
| `codex` | https://chatgpt.com/backend-api/codex | OAuth (builtin object) |
| `antigravity` | https://daily-cloudcode-pa.sandbox.googleapis.com | OAuth (builtin object) |
| `nvidia` | https://integrate.api.nvidia.com | API Key |
| `deepseek` | https://api.deepseek.com | API Key |
| custom (for example `mycustom`) | your configured `base_url` | API Key (`secret`) |
- Rust (must support `edition = 2024`)
- SQLite (default DSN uses sqlite)
- Optional: Node.js + `pnpm` (if you want to rebuild the admin frontend)
```bash
cp gproxy.example.toml gproxy.toml
```

At minimum, set:

- `global.admin_key`
- at least one enabled channel credential (`credentials.secret` or builtin credential object)

Bootstrap login defaults:

- username: `admin`
- password: value of `global.admin_key`
```bash
cargo run -p gproxy
```

On startup, gproxy prints:

- listening address (default `http://127.0.0.1:8787`)
- current admin key (`password:`)

If `./gproxy.toml` does not exist, gproxy starts with in-memory defaults and auto-generates a 16-digit admin key (printed to stdout).
```bash
curl -sS http://127.0.0.1:8787/openai/v1/models \
  -H "x-api-key: <your user key or admin key>"
```

Get a user/admin API key via password login:

```bash
curl -sS http://127.0.0.1:8787/login \
  -H "content-type: application/json" \
  -d '{
    "name": "admin",
    "password": "<your admin_key>"
  }'
```

- Download the binary from Releases.
- Prepare config: `cp gproxy.example.toml gproxy.toml`
- Run binary: `./gproxy`

Pull prebuilt image (recommended):

```bash
docker pull ghcr.io/leenhawk/gproxy:latest
```

Build from local source (only if you need local code changes):

```bash
docker build -t gproxy:local .
```

Run:

```bash
docker run --rm -p 8787:8787 \
  -e GPROXY_HOST=0.0.0.0 \
  -e GPROXY_PORT=8787 \
  -e GPROXY_ADMIN_KEY=your-admin-key \
  -e DATABASE_SECRET_KEY='replace-with-long-random-string' \
  -e GPROXY_DSN='sqlite:///app/data/gproxy.db?mode=rwc' \
  -v $(pwd)/data:/app/data \
  ghcr.io/leenhawk/gproxy:latest
```

Set `DATABASE_SECRET_KEY` via env vars or your platform secret manager rather than committing it to the repo. Especially on free-tier or shared managed databases, configure it before the first bootstrap so sensitive fields are not stored in plaintext, and keep the same key on every instance using that database.
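If you prefer docker compose, the `docker run` command above might be sketched like this (values carried over from the command; this file is not shipped with the repo, so adapt it to your deployment):

```yaml
services:
  gproxy:
    image: ghcr.io/leenhawk/gproxy:latest
    ports:
      - "8787:8787"
    environment:
      GPROXY_HOST: "0.0.0.0"
      GPROXY_PORT: "8787"
      GPROXY_ADMIN_KEY: "your-admin-key"
      DATABASE_SECRET_KEY: "replace-with-long-random-string"
      GPROXY_DSN: "sqlite:///app/data/gproxy.db?mode=rwc"
    volumes:
      # persist the SQLite database and tokenizer cache across restarts
      - ./data:/app/data
```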
- Template file: `claw.yaml`
- Use `claw.yaml` as a custom template in ClawCloud Run App Store -> My Apps -> Debugging.
- Key inputs: `admin_key` (generated by default), `proxy_url`, `rust_log`, `volume_size`
- Recommended persistence: mount `/app/data` as a persistent volume.
- Release CI publishes signed binaries and update manifests to a dedicated Cloudflare Pages downloads project.
- Default public base URL: https://download-gproxy.leenhawk.com
- Generated manifests:
  - `/manifest.json` — full download index used by the docs downloads page
  - `/releases/manifest.json` — stable self-update feed
  - `/staging/manifest.json` — staging self-update feed
- The admin UI `Cloudflare` update source and `/admin/system/self_update` read from this downloads site.
- Required GitHub Actions secrets for the downloads deployment:
  - `CLOUDFLARE_API_TOKEN`
  - `CLOUDFLARE_ACCOUNT_ID`
  - `CLOUDFLARE_DOWNLOADS_PROJECT_NAME`
- Optional secrets:
  - `DOWNLOAD_PUBLIC_BASE_URL` — custom public domain or Pages URL exposed in docs/manifests
  - `UPDATE_SIGNING_KEY_ID` — manifest key id override (default `gproxy-release-v1`)
  - `UPDATE_SIGNING_PRIVATE_KEY_B64` and `UPDATE_SIGNING_PUBLIC_KEY_B64` — checksum signature generation and verification
- Console entry: `GET /`
- Static assets: `/assets/*`
- Frontend build output: `apps/gproxy/frontend/dist`
- Backend embeds `dist` into the binary via `rust-embed`
If you changed frontend code, rebuild first:

```bash
cd apps/gproxy/frontend
pnpm install
pnpm build
cd ../../..
cargo run -p gproxy
```

Reference files:

- `gproxy.example.toml` (minimal)
- `gproxy.example.full.toml` (full)
| Field | Description |
|---|---|
| `host` | Bind host, default `127.0.0.1` |
| `port` | Bind port, default `8787` |
| `proxy` | Upstream proxy (empty string means disabled) |
| `hf_token` | HuggingFace token (optional for tokenizer download) |
| `hf_url` | HuggingFace base URL, default `https://huggingface.co` |
| `admin_key` | Admin bootstrap credential; used as admin password and admin API key on bootstrap, auto-generated if empty |
| `mask_sensitive_info` | Redact sensitive request/response payloads in logs/events |
| `data_dir` | Data directory, default `./data` |
| `dsn` | Database DSN; if omitted and `data_dir` is changed, sqlite DSN is derived automatically |
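A sketch of how these fields sit in `gproxy.toml` (values are illustrative; the `[global]` table name follows the `global.admin_key` path used earlier, and omitted fields keep their defaults):

```toml
[global]
host = "127.0.0.1"
port = 8787
admin_key = "change-me"        # auto-generated if left empty
mask_sensitive_info = true
data_dir = "./data"
# dsn is derived from data_dir when omitted
```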
| Field | Default | Description |
|---|---|---|
| `storage_write_queue_capacity` | 4096 | Storage write queue size |
| `storage_write_max_batch_size` | 1024 | Max events per aggregated storage batch |
| `storage_write_aggregate_window_ms` | 25 | Aggregation window (ms) |
Each channel is declared with `[[channels]]`:

- `id`: channel id (for example `openai`, `claude`, `mycustom`)
- `enabled`: runtime enable switch (`false` disables routing to this channel)
- `settings`: channel settings (must include `base_url`)
- `dispatch`: optional; defaults to channel-specific dispatch table when omitted
- `credentials`: credential list (supports multi-credential retry/fallback)
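A minimal declaration for a custom API-key channel might look like this (the `mycustom` id, URL, and secret value are illustrative; field names follow the list above):

```toml
[[channels]]
id = "mycustom"
enabled = true

[channels.settings]
base_url = "https://api.example.com"

# one entry per credential; multiple entries enable retry/fallback
[[channels.credentials]]
secret = "sk-your-upstream-key"
```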
For `anthropic` and `claudecode`, configure cache-control rewrite with:

- setting key: `channels.settings.cache_breakpoints`
- max 4 rules
- targets: `top_level` (`global` alias), `tools`, `system`, `messages`
- `messages` indexing uses flattened `messages[*].content` blocks after normalizing Claude shorthands (`content: "..."` becomes one text block)
- for `messages`, you may also set `content_position` / `content_index`; when either field is present, `position`/`index` first select a message, then `content_*` selects a block inside that message
- `ttl`: `auto` / `5m` / `1h` (`auto` means no ttl field is injected)
- existing request-side `cache_control` is always preserved and counts toward the 4-rule limit

No-ttl default note:

- `anthropic`: upstream default is `5m`
- `claudecode`: upstream default is `5m`
- use explicit ttl when you need deterministic behavior
Example:

```toml
[[channels]]
id = "anthropic"
enabled = true

[channels.settings]
base_url = "https://api.anthropic.com"
cache_breakpoints = [
  { target = "top_level", ttl = "auto" },
  { target = "messages", position = "last_nth", index = 1, ttl = "5m" },
  { target = "messages", position = "last_nth", index = 1, content_position = "last_nth", content_index = 1, ttl = "5m" }
]

[[channels]]
id = "claudecode"
enabled = true

[channels.settings]
base_url = "https://api.anthropic.com"
cache_breakpoints = [
  { target = "top_level", ttl = "auto" },
  { target = "messages", position = "last_nth", index = 1, content_position = "last_nth", content_index = 1, ttl = "1h" }
]
```

Each credential can include:

- `id` / `label`: optional identifiers
- `secret`: for API key channels
- `builtin`: structured credential object for OAuth/service-account channels
- `state`: optional health-state seed

`state.health.kind` supports:

- `healthy`
- `partial` (with model cooldown list)
- `dead`
Provider credential routing is controlled by the following settings in `channels.settings`:

- `credential_round_robin_enabled` (default `true`)
- `credential_cache_affinity_enabled` (default `true`, only effective when round-robin is enabled)
- `credential_cache_affinity_max_keys` (default `4096`, max retained affinity keys per channel)
Effective behavior:

- `credential_round_robin_enabled = false` -> `StickyNoCache`
  - no round-robin
  - no cache affinity pool
  - picks the smallest available credential id and keeps using it until unavailable/cooldown
- `credential_round_robin_enabled = true` and `credential_cache_affinity_enabled = true` -> `RoundRobinWithCache`
  - round-robin/random among eligible credentials
  - enables internal cache affinity pool for cache-key sticky routing
- `credential_round_robin_enabled = true` and `credential_cache_affinity_enabled = false` -> `RoundRobinNoCache`
  - round-robin/random among eligible credentials
  - no cache affinity pool
Example:

```toml
[[channels]]
id = "openai"
enabled = true

[channels.settings]
base_url = "https://api.openai.com"
credential_round_robin_enabled = true
credential_cache_affinity_enabled = true
credential_cache_affinity_max_keys = 4096
```

Legacy compatibility: `credential_pick_mode` is still accepted for backward compatibility.
Detailed design and cache-hit strategy (OpenAI/Claude/Gemini):
https://gproxy.leenhawk.com/guides/credential-selection-cache-affinity/
Priority: CLI flags / env vars > `gproxy.toml` > defaults

Supported overrides:

- `--config` / `GPROXY_CONFIG_PATH`
- `--host` / `GPROXY_HOST`
- `--port` / `GPROXY_PORT`
- `--proxy` / `GPROXY_PROXY`
- `--admin-key` / `GPROXY_ADMIN_KEY`
- `--bootstrap-force-config` / `GPROXY_BOOTSTRAP_FORCE_CONFIG`
- `--mask-sensitive-info` / `GPROXY_MASK_SENSITIVE_INFO`
- `--data-dir` / `GPROXY_DATA_DIR`
- `--dsn` / `GPROXY_DSN`
- `--storage-write-queue-capacity` / `GPROXY_STORAGE_WRITE_QUEUE_CAPACITY`
- `--storage-write-max-batch-size` / `GPROXY_STORAGE_WRITE_MAX_BATCH_SIZE`
- `--storage-write-aggregate-window-ms` / `GPROXY_STORAGE_WRITE_AGGREGATE_WINDOW_MS`
- `--database-secret-key` / `DATABASE_SECRET_KEY`
Configure the database-at-rest encryption key with `--database-secret-key` or the `DATABASE_SECRET_KEY` environment variable.

When `DATABASE_SECRET_KEY` is unset, gproxy stores and reads DB values as plaintext. When it is set, gproxy transparently encrypts at rest for `credential.secret_json`, user API keys, user passwords, `admin_key`, and `hf_token`.
Recommendations:
- set the key before the first database bootstrap and keep it identical on every instance using the same database;
- on free-tier or shared managed databases, strongly prefer setting the key so sensitive values are not stored in plaintext;
- inject it via env vars / platform secrets instead of committing it to the repo;
- do not change it casually after encrypted data exists, or older ciphertext will become unreadable without migration / re-encryption.
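Any standard random-string generator satisfies the "long random string" recommendation; one portable sketch (hex-encoding `/dev/urandom`, chosen here because it avoids pipeline quirks — not the only valid approach):

```shell
# 24 random bytes, hex-encoded -> 48-character key.
# od -An suppresses offsets; tr strips spaces and newlines.
DATABASE_SECRET_KEY="$(od -An -tx1 -N24 /dev/urandom | tr -d ' \n')"
echo "$DATABASE_SECRET_KEY"
```

Export the result through your platform's secret manager rather than writing it into `gproxy.toml`.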
`--bootstrap-force-config` / `GPROXY_BOOTSTRAP_FORCE_CONFIG` is a startup-only switch (CLI/env only, not a `gproxy.toml` field).

- default (`false` or unset):
  - if DB is not initialized, bootstrap from `gproxy.toml` as usual.
  - if DB is already initialized, prefer DB state and skip config-file channel/provider import.
  - startup `admin_key` override is still honored.
- `true`:
  - force apply config-file channels/settings/credentials/global values on boot.
  - this mode is useful when you intentionally want the config file to overwrite existing DB bootstrap state.
All errors return:

```json
{ "error": "..." }
```

- `POST /login` uses JSON body `{ "name": "...", "password": "..." }` and returns `api_key`
- Admin/User APIs (except `/login`): use `x-api-key`
- Provider APIs also accept:
  - `x-api-key`
  - `x-goog-api-key`
  - `Authorization: Bearer ...`
  - Gemini query key `?key=...` (normalized into `x-api-key`)
Provider is explicit in path, examples:

- `POST /openai/v1/chat/completions`
- `POST /anthropic/v1/messages`
- `POST /aistudio/v1beta/models/{model}:generateContent`

Provider is resolved from model prefix:

- `POST /v1/chat/completions`
- `POST /v1/responses`
- `POST /v1/messages`
- `GET /v1/models`
- `GET /v1/models/{provider}/{model}`

Constraints:

- For OpenAI/Claude-style request bodies, `model` must be `<provider>/<model>`, for example `openai/gpt-4.1`.
- For Gemini target paths, provider must be included, for example `models/aistudio/gemini-2.5-flash:generateContent`.
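The `<provider>/<model>` convention splits on the first `/`; a tiny illustrative helper (the `split_model` name is hypothetical, not part of gproxy):

```shell
# Split a provider-prefixed model id on the first "/" (hypothetical helper).
split_model() {
  printf '%s %s\n' "${1%%/*}" "${1#*/}"
}

split_model "openai/gpt-4.1"             # -> openai gpt-4.1
split_model "aistudio/gemini-2.5-flash"  # -> aistudio gemini-2.5-flash
```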
- `GET /{provider}/v1/oauth`
- `GET /{provider}/v1/oauth/callback`
- `GET /{provider}/v1/usage`

OAuth-capable channels: `codex`, `claudecode`, `geminicli`, `antigravity`
Main groups:

- Global settings: `/admin/global-settings`, `/admin/global-settings/upsert`
- Config export/import: `/admin/config/export-toml`, `/admin/config/import-toml`
- Self update: `/admin/system/self_update`
- Providers/Credentials/CredentialStatuses: `query` / `upsert` / `delete`
- Users: `query` / `upsert` / `delete` (`/admin/users/upsert` requires `password`)
- UserKeys: `query` / `generate` / `delete`
- Requests: `/admin/requests/upstream/query`, `/admin/requests/downstream/query`
- Usage: `/admin/usages/query`, `/admin/usages/summary`
- `POST /user/keys/query`
- `POST /user/keys/generate`
- `POST /user/keys/delete`
- `POST /user/usages/query`
- `POST /user/usages/summary`
```bash
curl -sS http://127.0.0.1:8787/openai/v1/chat/completions \
  -H "x-api-key: <key>" \
  -H "content-type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "messages": [{"role":"user","content":"hello"}],
    "stream": false
  }'
```

```bash
curl -sS http://127.0.0.1:8787/v1/chat/completions \
  -H "x-api-key: <key>" \
  -H "content-type: application/json" \
  -d '{
    "model": "openai/gpt-4.1",
    "messages": [{"role":"user","content":"hello"}],
    "stream": false
  }'
```

```bash
curl -sS "http://127.0.0.1:8787/aistudio/v1beta/models/gemini-2.5-flash:generateContent" \
  -H "x-api-key: <key>" \
  -H "content-type: application/json" \
  -d '{
    "contents":[{"role":"user","parts":[{"text":"hello"}]}]
  }'
```

Make sure both providers have at least one `cache_breakpoints` rule (for example `{ target = "top_level", ttl = "auto" }`).
```bash
BASE="http://127.0.0.1:8787"
KEY="<your x-api-key>"
SYS="$(for i in $(seq 1 1800); do printf 'cache-prefix-%04d ' "$i"; done)"

# 1) Claude first request (cache write)
curl -sS "$BASE/anthropic/v1/messages" \
  -H "x-api-key: $KEY" \
  -H "content-type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  --data-binary @- <<JSON | jq '.usage'
{
  "model": "claude-neptune-v3",
  "max_tokens": 128,
  "stream": false,
  "system": "$SYS",
  "messages": [
    { "role": "user", "content": "only reply: cache-ok" }
  ]
}
JSON

# 2) Claude second request (cache read)
curl -sS "$BASE/anthropic/v1/messages" \
  -H "x-api-key: $KEY" \
  -H "content-type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  --data-binary @- <<JSON | jq '.usage'
{
  "model": "claude-neptune-v3",
  "max_tokens": 128,
  "stream": false,
  "system": "$SYS",
  "messages": [
    { "role": "user", "content": "only reply: cache-ok" }
  ]
}
JSON

# 3) ClaudeCode first request (cache write)
curl -sS "$BASE/claudecode/v1/messages" \
  -H "x-api-key: $KEY" \
  -H "content-type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  --data-binary @- <<JSON | jq '.usage'
{
  "model": "claude-sonnet-4-6",
  "max_tokens": 128,
  "stream": false,
  "system": "$SYS",
  "messages": [
    { "role": "user", "content": "only reply: cache-ok" }
  ]
}
JSON

# 4) ClaudeCode second request (cache read)
curl -sS "$BASE/claudecode/v1/messages" \
  -H "x-api-key: $KEY" \
  -H "content-type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  --data-binary @- <<JSON | jq '.usage'
{
  "model": "claude-sonnet-4-6",
  "max_tokens": 128,
  "stream": false,
  "system": "$SYS",
  "messages": [
    { "role": "user", "content": "only reply: cache-ok" }
  ]
}
JSON
```

| Path | Responsibility |
|---|---|
| `apps/gproxy` | Executable service entry (Axum + embedded admin frontend) |
| `crates/gproxy-core` | AppState, router orchestration, auth, request execution |
| `crates/gproxy-provider` | channel implementations, retry, OAuth, dispatch, tokenizers |
| `crates/gproxy-middleware` | protocol transform middleware, usage extraction |
| `crates/gproxy-protocol` | OpenAI/Claude/Gemini typed protocol models and transforms |
| `crates/gproxy-storage` | SeaORM storage layer, query models, async write queue |
| `crates/gproxy-admin` | admin/user domain operations |
- On bootstrap:
  - load config and apply CLI/env overrides
  - choose bootstrap source mode (`bootstrap_force_config`): prefer DB state by default once initialized
  - connect database and sync schema automatically
  - initialize provider registry, credentials, and credential states
  - ensure admin principal (`id=0`) and admin key exist
- On request:
  - authenticate user key
  - route + transform/forward according to dispatch table
  - pick eligible credentials and retry/fallback
  - persist upstream/downstream events and usage records
- `healthy`: available
- `partial`: model-level cooldown
- `dead`: unavailable

Default cooldowns:

- rate limit: `60s`
- transient failure: `15s`
Provider smoke/regression scripts:

- `tests/provider/curl_provider.sh`
- `tests/provider/run_channel_regression.sh`

Examples:

```bash
API_KEY='<key>' tests/provider/curl_provider.sh \
  --provider openai \
  --method openai_chat \
  --model gpt-4.1
```

```bash
API_KEY='<key>' tests/provider/run_channel_regression.sh \
  --provider openai \
  --model gpt-5-nano \
  --embedding-model text-embedding-3-small
```

- Ensure `x-api-key` is provided for key-protected routes, and both the key and its owner user are enabled.
- If you don't have a key yet, call `POST /login` with username/password first.
- The key is not owned by the admin user (`id=0`).
- Check:
  - whether the channel has any available credential
  - whether credential status is `dead` or currently `partial` for the target model
  - whether upstream keeps returning 429/5xx
- You called an unscoped route without a provider-prefixed model.
- `/v1/realtime` is currently not implemented; use `/v1/responses` (HTTP) instead.
- Set a strong `admin_key` in production.
- Keep `mask_sensitive_info = true` unless you explicitly need full payload visibility for debugging.
- If you configure an outbound proxy, ensure the proxy path is trusted and access-controlled.
By default:

- data dir: `./data`
- default DB: `sqlite://./data/gproxy.db?mode=rwc`
- tokenizer cache: `./data/tokenizers`
`gproxy-storage` supports sqlite / mysql / postgres via `dsn`.
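For reference, DSN shapes for each backend might look like this (hosts, credentials, and database names are placeholders; only the sqlite form is taken from this README):

```toml
# Illustrative dsn values; credentials and hosts are placeholders.
dsn = "sqlite://./data/gproxy.db?mode=rwc"
# dsn = "mysql://user:pass@localhost:3306/gproxy"
# dsn = "postgres://user:pass@localhost:5432/gproxy"
```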
```bash
# backend format/lint/check
cargo fmt
cargo check
cargo clippy --workspace --all-targets

# tests
cargo test --workspace

# run service
cargo run -p gproxy
```

Frontend:

```bash
cd apps/gproxy/frontend
pnpm install
pnpm typecheck
pnpm build
```

This project is licensed under AGPL-3.0-or-later (see LICENSE).