Lacuna fills the gap between your tailnet users and the AI providers.
Lacuna is a free and open-source API gateway for AI providers. It is meant to be deployed on your Tailscale network to grant AI API access to your tailnet members without having to distribute API keys.
- Supported Providers: OpenAI, Anthropic, Bedrock, Gemini.
- Automatic Routing: Routes requests to the first compatible provider based on which endpoint is called. For example, calls to `/v1/chat/completions` are automatically routed to the first OpenAI-compatible provider.
- Provider-Specific Routing: Dedicated base URL for each provider. For example, calls to `/myprovider/v1/chat/completions` will always route requests to `myprovider`.
- Prometheus Metrics: Exposes Prometheus metrics at `/metrics` for usage monitoring. Includes per-user metrics.
- Fine-Grained Permissions: Control which users can access which providers and models based on Tailscale application capabilities.
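The two routing modes above can be sketched roughly as follows. This is an illustrative model only, not Lacuna's actual implementation; the `openai_chat_completions` flag name and the endpoint-to-compatibility mapping are assumptions for the sketch (the example configuration below only shows `anthropic_messages` and `bedrock_model_invoke`).

```python
# Simplified sketch of Lacuna's two routing modes (illustration only).
PROVIDERS = {
    "openai": {"compatibility": {"openai_chat_completions": True}},
    "anthropic": {"compatibility": {"anthropic_messages": True}},
}

# Assumed mapping from endpoint to the compatibility flag it requires.
ENDPOINT_COMPAT = {
    "/v1/chat/completions": "openai_chat_completions",
    "/v1/messages": "anthropic_messages",
}

def route(path: str) -> str:
    """Return the provider name a request path would be sent to."""
    # Provider-specific routing: /myprovider/... always goes to myprovider.
    first_segment = path.lstrip("/").split("/", 1)[0]
    if first_segment in PROVIDERS:
        return first_segment
    # Automatic routing: first provider whose compatibility flag matches.
    needed = ENDPOINT_COMPAT.get(path)
    for name, cfg in PROVIDERS.items():
        if cfg["compatibility"].get(needed):
            return name
    raise LookupError(f"no provider for {path}")
```

For example, `route("/anthropic/v1/messages")` pins the request to `anthropic`, while `route("/v1/chat/completions")` falls through to the first OpenAI-compatible provider.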
- Web Interface: Minimal web interface that displays configured providers.
See the GitHub Releases page.
lacuna --config <path> [--host <host>] [--port <port>]
There are usage examples in the examples directory.
Docker images are published at ghcr.io/flared/lacuna:
docker pull ghcr.io/flared/lacuna:latest
The configuration file defines one or more AI providers in JSON format.
The provided configuration file may include environment variable substitution using the `${VAR_NAME}` syntax.
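The `${VAR_NAME}` expansion can be sketched as follows. This is an illustration of the substitution behavior, not Lacuna's own code; how Lacuna handles unset variables is an assumption here (the sketch substitutes an empty string).

```python
import os
import re

def expand_env(text: str) -> str:
    """Replace ${VAR_NAME} occurrences with the environment value.

    Unset variables become empty strings in this sketch; Lacuna's
    actual behavior for unset variables may differ.
    """
    return re.sub(
        r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}",
        lambda m: os.environ.get(m.group(1), ""),
        text,
    )

os.environ["ANTHROPIC_API_KEY"] = "sk-test"
print(expand_env('"apikey": "${ANTHROPIC_API_KEY}"'))  # → "apikey": "sk-test"
```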
Example Configuration
{
"lacuna": {
"logging": {
"format": "console",
"level": "info"
},
"capabilities_header": "Tailscale-App-Capability",
"identity_header": "Tailscale-User-Login"
},
"providers": {
"anthropic": {
"name": "Anthropic",
"baseurl": "https://api.anthropic.com",
"authorization": "x-api-key",
"apikey": "${ANTHROPIC_API_KEY}",
"capability": {
"models": ["claude-*"],
"user_agents": ["claude-code"]
},
"compatibility": {
"anthropic_messages": true
}
},
"bedrock": {
"name": "AWS Bedrock",
"baseurl": "https://bedrock-runtime.us-east-1.amazonaws.com",
"authorization": "bearer",
"apikey": "${BEDROCK_API_KEY}",
"compatibility": {
"bedrock_model_invoke": true
}
}
}
}

When `capabilities_header` is set, Lacuna expects a header that follows the Tailscale application capabilities format:
{
"flare.io/cap/lacuna/grants": [
{ "providers": ["firstprovider", "secondprovider"] },
{ "providers": ["thirdprovider-*"], "models": ["model-1"] }
],
"flare.io/cap/lacuna/labels": [
{ "team": "platform", "env": "production" }
]
}

In the above example, the user may:
- Use all models of `firstprovider` and `secondprovider`.
- Use `model-1` in any provider that starts with `thirdprovider-`.
The flare.io/cap/lacuna/labels key carries flat key-value metadata that is attached to Prometheus metrics and traces for observability.
Notes:
- Providers and models may contain glob patterns.
- Empty lists and omitted values default to `["*"]`.
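Given those rules, grant evaluation can be modeled with Python's `fnmatch` for the glob matching. This is an illustrative sketch of the semantics described above, not Lacuna's implementation.

```python
from fnmatch import fnmatch

def is_allowed(grants, provider: str, model: str) -> bool:
    """Check provider/model access against a list of grant objects.

    Empty lists and omitted keys default to ["*"], i.e. allow everything.
    A request is allowed if any single grant matches both fields.
    """
    for grant in grants:
        providers = grant.get("providers") or ["*"]
        models = grant.get("models") or ["*"]
        if (any(fnmatch(provider, p) for p in providers)
                and any(fnmatch(model, m) for m in models)):
            return True
    return False

# The grants from the example header above.
grants = [
    {"providers": ["firstprovider", "secondprovider"]},
    {"providers": ["thirdprovider-*"], "models": ["model-1"]},
]
```

With these grants, any model on `firstprovider` is allowed, `model-1` is allowed on `thirdprovider-eu`, but `model-2` on `thirdprovider-eu` is denied.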
More on Tailscale application capabilities:
When capability is set on a provider, Lacuna will use it to filter the requests that are allowed to use that provider:
{
"providers": {
"anthropic": {
"baseurl": "https://api.anthropic.com",
"capability": {
"models": ["claude-*"],
"user_agents": ["claude-code"]
}
}
}
}

In the above example:
- The `anthropic` provider will only allow requests with a User-Agent that matches Lacuna's built-in `claude-code` pattern.
- The `anthropic` provider may only be used with models that match the `claude-*` glob pattern.
Notes:
- Built-in User-Agent patterns can be found in src/user_agent.rs.
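The per-provider filter can be modeled the same way. This is a sketch only: the `claude-cli/*` stand-in pattern is hypothetical, since the real built-in User-Agent patterns are defined in src/user_agent.rs.

```python
from fnmatch import fnmatch

# Hypothetical stand-in for a built-in User-Agent pattern; the real
# patterns live in src/user_agent.rs.
USER_AGENT_PATTERNS = {"claude-code": "claude-cli/*"}

def provider_accepts(capability, model: str, user_agent: str) -> bool:
    """Return True if a request passes the provider's capability filter.

    Omitted or empty lists default to ["*"] (allow everything). Named
    user_agents entries resolve to built-in patterns; unknown names are
    treated as literal glob patterns in this sketch.
    """
    models = capability.get("models") or ["*"]
    agents = capability.get("user_agents") or ["*"]
    model_ok = any(fnmatch(model, m) for m in models)
    agent_ok = any(
        fnmatch(user_agent, USER_AGENT_PATTERNS.get(a, a)) for a in agents
    )
    return model_ok and agent_ok

# The capability from the example provider above.
cap = {"models": ["claude-*"], "user_agents": ["claude-code"]}
```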
- cargo: https://doc.rust-lang.org/cargo/getting-started/installation.html
- cargo-edit: `cargo install cargo-edit`
- pnpm: https://pnpm.io/installation
General
- `make ci`: Run CI-equivalent checks locally.
- `make docker-build`: Build the Docker image.
- `bin/bump-version`: Bump the version number to prepare a release.
API Targets
- `make build`: Build the API.
- `make run`: Run the app with the example config.
- `make test`: Run tests.
- `make format`: Format the code.
- `make fix`: Automatically fix lint warnings.
- `make clippy`: Lint for common errors.
Frontend Targets
- `make frontend-build`: Build the frontend.
- `make frontend-format`: Format the frontend.
- `make frontend-lint`: Lint the frontend.
- `make frontend-run`: Serve the frontend with auto-reload using Vite. You must also have the backend running.