Merged
4 changes: 2 additions & 2 deletions .dev.vars.example
@@ -6,8 +6,8 @@
# ANTHROPIC_PROXY_BASE_URL=https://openrouter.ai/api/v1

# Model configuration (optional)
# REASONING_MODEL=deepseek/deepseek-r1-0528:free
# COMPLETION_MODEL=deepseek/deepseek-r1-0528:free
# REASONING_MODEL=z-ai/glm-4.5-air:free
# COMPLETION_MODEL=z-ai/glm-4.5-air:free

# Enable debug logging (optional)
# DEBUG=true
4 changes: 2 additions & 2 deletions .env.example
@@ -6,8 +6,8 @@
# ANTHROPIC_PROXY_BASE_URL=https://openrouter.ai/api/v1

# Model configuration (optional)
# REASONING_MODEL=deepseek/deepseek-r1-0528:free
# COMPLETION_MODEL=deepseek/deepseek-r1-0528:free
# REASONING_MODEL=z-ai/glm-4.5-air:free
# COMPLETION_MODEL=z-ai/glm-4.5-air:free

# Enable debug logging (optional)
# DEBUG=true
145 changes: 121 additions & 24 deletions CLAUDE.md
@@ -9,17 +9,28 @@ This is a Claude Code proxy service that translates between Anthropic's Claude A
## Architecture

### Core Components
- **`src/index.ts`** - Main Hono application with API proxy logic
- **`src/index.ts`** - Main Hono application with API proxy logic (src/index.ts:39-516)
- **`src/server.ts`** - Node.js server wrapper for CLI distribution with argument parsing
- **`src/transform.ts`** - API format transformation utilities between OpenAI and Claude formats

### API Translation Logic
The proxy service handles (in `src/index.ts:32-450`):
### API Translation Logic (src/index.ts:39-516)
The proxy handles `/v1/messages` POST requests and transforms between formats:
- **Message normalization**: Converts Claude's nested content arrays to OpenAI's flat structure
- **Tool call mapping**: Transforms Claude's `tool_use`/`tool_result` to OpenAI's `tool_calls`/`tool` roles
- **Schema transformation**: Removes `format: 'uri'` constraints from JSON schemas for compatibility
- **Model routing**: Dynamically selects models based on request type (reasoning vs completion)
- **Model routing**: Dynamically selects models based on `thinking` flag in request
- **Streaming support**: Handles both streaming and non-streaming responses with SSE
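
The message normalization and tool call mapping listed above can be sketched roughly as follows. This is a minimal illustration, not the code in `src/index.ts`: the type names and the `normalizeMessage` helper are hypothetical, the block shapes are simplified, and joining text blocks with a newline is an assumption.

```typescript
// Hypothetical, simplified shapes; the real types in src/index.ts may differ.
type ClaudeBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown }
  | { type: "tool_result"; tool_use_id: string; content: string };

interface ClaudeMessage {
  role: "user" | "assistant";
  content: string | ClaudeBlock[];
}

interface OpenAIToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
}

interface OpenAIMessage {
  role: "user" | "assistant" | "tool";
  content: string | null;
  tool_calls?: OpenAIToolCall[];
  tool_call_id?: string;
}

// Flattens one Claude message into one or more OpenAI-style messages.
function normalizeMessage(msg: ClaudeMessage): OpenAIMessage[] {
  if (typeof msg.content === "string") {
    return [{ role: msg.role, content: msg.content }];
  }
  const out: OpenAIMessage[] = [];
  const textParts: string[] = [];
  const toolCalls: OpenAIToolCall[] = [];
  for (const block of msg.content) {
    if (block.type === "text") {
      textParts.push(block.text);
    } else if (block.type === "tool_use") {
      // OpenAI expects the tool arguments as a JSON-encoded string.
      toolCalls.push({
        id: block.id,
        type: "function",
        function: { name: block.name, arguments: JSON.stringify(block.input) },
      });
    } else {
      // Each tool_result becomes its own message with the "tool" role.
      out.push({ role: "tool", tool_call_id: block.tool_use_id, content: block.content });
    }
  }
  if (textParts.length > 0 || toolCalls.length > 0) {
    const flat: OpenAIMessage = {
      role: msg.role,
      content: textParts.length > 0 ? textParts.join("\n") : null,
    };
    if (toolCalls.length > 0) flat.tool_calls = toolCalls;
    out.push(flat);
  }
  return out;
}
```

Tool results are emitted before any remaining text so that they directly follow the assistant message that requested them, which is the ordering OpenAI-compatible chat APIs generally expect.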

### Transformation Module (src/transform.ts)
Key exported functions:
- `transformOpenAIToClaude()`: Main transformation from OpenAI to Claude format
- `sanitizeRoot()`: Drops unsupported OpenAI parameters and ensures Claude requirements
- `mapTools()`: Converts OpenAI tools/functions to Claude tool format
- `mapToolChoice()`: Maps OpenAI function_call to Claude tool_choice
- `transformMessages()`: Converts message roles and content blocks
- `removeUriFormat()`: Recursively removes format:'uri' from JSON schemas
- `transformClaudeToOpenAI()`: Converts Claude responses back to OpenAI format
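
As an illustration of the recursive schema cleanup described for `removeUriFormat()`, here is a minimal sketch. It shares the function name but is not taken from `src/transform.ts`; the `Json` type and the exact traversal are assumptions.

```typescript
// Hypothetical JSON value type used only for this sketch.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Returns a copy of the schema with every `format: "uri"` entry removed.
// Other format values (for example "date-time") are left untouched.
function removeUriFormat(schema: Json): Json {
  if (Array.isArray(schema)) {
    return schema.map((item) => removeUriFormat(item));
  }
  if (schema === null || typeof schema !== "object") {
    return schema;
  }
  const out: { [key: string]: Json } = {};
  for (const [key, value] of Object.entries(schema)) {
    if (key === "format" && value === "uri") continue;
    out[key] = removeUriFormat(value);
  }
  return out;
}
```

The sketch returns a new object instead of mutating its input, so the caller's original tool definition stays unchanged.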

### Dual Runtime Support
- **Cloudflare Workers**: Uses Hono's built-in fetch handler (`src/index.ts`)
- **Node.js**: Uses `@hono/node-server` adapter (`src/server.ts`)
@@ -30,38 +41,66 @@ The proxy service handles (in `src/index.ts:32-450`):
# Install dependencies
bun install

# Local development server (hot reload)
# Local development server (hot reload on port 3000)
bun run start

# Cloudflare Workers development
# Cloudflare Workers development (local testing)
bun run dev

# Build CLI package
# Build CLI binary to ./bin
bun run build

# Test the built CLI
bun run bin --help
./bin --help
./bin -p 8080 # Run on different port

# Deploy to Cloudflare Workers
bun run deploy

# Set environment variables for Cloudflare Workers
npx wrangler secret put CLAUDE_CODE_PROXY_API_KEY
npx wrangler env put REASONING_MODEL "z-ai/glm-4.5-air:free"

# Publishing to npm
npm version patch # or minor/major
npm publish
```

## CLI Package

The project builds to an executable CLI via `bun run build`:
- **Output**: `./bin` - Standalone Node.js executable
- **Version management**: Reads from `package.json` dynamically
- **CLI flags**: `-v/--version`, `--help`, `-p/--port`
- **Output**: `./bin` - Standalone Node.js executable with ES module format
- **Version management**: Reads dynamically from `package.json`
- **CLI flags**:
  - `-v/--version`: Show version
  - `--help`: Show help information
  - `-p/--port <PORT>`: Set server port (default: 3000)
- **Build process**: Uses Bun's native TypeScript compilation with executable permission

## Environment Variables

Configure via `wrangler.toml` or environment:
- `CLAUDE_CODE_PROXY_API_KEY` - Bearer token for upstream API

### Required
- `CLAUDE_CODE_PROXY_API_KEY` - Bearer token for upstream API authentication

### Optional Configuration
- `ANTHROPIC_PROXY_BASE_URL` - Upstream API URL (default: https://models.github.ai/inference)
- `REASONING_MODEL` - Model for reasoning requests (default: openai/gpt-4.1)
- `COMPLETION_MODEL` - Model for completion requests (default: openai/gpt-4.1)
- `REASONING_MAX_TOKENS` - Max tokens for reasoning model (optional)
- `COMPLETION_MAX_TOKENS` - Max tokens for completion model (optional)
- `REASONING_MAX_TOKENS` - Max tokens for reasoning model (overrides request setting)
- `COMPLETION_MAX_TOKENS` - Max tokens for completion model (overrides request setting)
- `REASONING_EFFORT` - Reasoning effort level for reasoning model (values: "low", "medium", "high")
- `DEBUG` - Enable debug logging (default: false)
- `PORT` - Server port for Node.js mode (default: 3000)

### Model Selection Logic
- When request contains `thinking: true`, uses `REASONING_MODEL`
- Otherwise uses `COMPLETION_MODEL`
- `REASONING_EFFORT` only applies when using reasoning models
- Max tokens overrides take precedence over request-provided values
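
The selection rules above could be expressed roughly like this. It is a sketch under stated assumptions: the `selectModel` helper and its types are hypothetical, any truthy `thinking` value is treated as a reasoning request, and `reasoning_effort` as the upstream field name is a guess based on OpenAI-style APIs. The default model comes from the variable list above.

```typescript
// Hypothetical shapes for this sketch; the real proxy may differ.
interface ProxyEnv {
  REASONING_MODEL?: string;
  COMPLETION_MODEL?: string;
  REASONING_MAX_TOKENS?: string;
  COMPLETION_MAX_TOKENS?: string;
  REASONING_EFFORT?: string;
}

interface IncomingRequest {
  thinking?: unknown;
  max_tokens?: number;
}

interface UpstreamSettings {
  model: string;
  max_tokens?: number;
  reasoning_effort?: string;
}

const DEFAULT_MODEL = "openai/gpt-4.1";

function selectModel(req: IncomingRequest, env: ProxyEnv): UpstreamSettings {
  // Assumption: any truthy `thinking` value marks a reasoning request.
  const reasoning = Boolean(req.thinking);
  const model = reasoning
    ? (env.REASONING_MODEL ?? DEFAULT_MODEL)
    : (env.COMPLETION_MODEL ?? DEFAULT_MODEL);

  // An environment override wins over the request-provided max_tokens.
  const override = reasoning ? env.REASONING_MAX_TOKENS : env.COMPLETION_MAX_TOKENS;
  const parsed = override ? Number.parseInt(override, 10) : Number.NaN;
  const maxTokens = Number.isNaN(parsed) ? req.max_tokens : parsed;

  const settings: UpstreamSettings = { model };
  if (maxTokens !== undefined) settings.max_tokens = maxTokens;
  // REASONING_EFFORT is only forwarded for reasoning requests.
  if (reasoning && env.REASONING_EFFORT) {
    settings.reasoning_effort = env.REASONING_EFFORT;
  }
  return settings;
}
```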

## Deployment Options

### Cloudflare Workers
@@ -71,10 +110,19 @@ bun run deploy
```

### Docker
Multi-stage build with production optimization:
Multi-stage build with production optimization using distroless image:
```bash
# Build image
docker build -t claude-code-proxy .
docker run -d -p 3000:3000 claude-code-proxy

# Run with environment variables
docker run -d -p 3000:3000 \
-e CLAUDE_CODE_PROXY_API_KEY=your_token \
-e ANTHROPIC_PROXY_BASE_URL=https://models.github.ai/inference \
claude-code-proxy

# Development with hot reload (using compose.yml)
docker compose up
```

### NPM Package
@@ -86,14 +134,28 @@ claude-code-proxy --help

## GitHub Actions Integration

Service container setup for `@claude` mentions:
Service container setup for `@claude` mentions in GitHub Actions:
```yaml
services:
  claude-code-proxy:
    image: ghcr.io/kiyo-e/claude-code-proxy:latest
    ports: [3000:3000]
    env:
      CLAUDE_CODE_PROXY_API_KEY: ${{ secrets.GITHUB_TOKEN }}
jobs:
  review:
    runs-on: ubuntu-latest
    services:
      claude-code-proxy:
        image: ghcr.io/kiyo-e/claude-code-proxy:latest
        ports:
          - 3000:3000
        env:
          CLAUDE_CODE_PROXY_API_KEY: ${{ secrets.GITHUB_TOKEN }}
          ANTHROPIC_PROXY_BASE_URL: https://models.github.ai/inference
          REASONING_MODEL: openai/gpt-4.1
          COMPLETION_MODEL: openai/gpt-4.1

    steps:
      - uses: actions/checkout@v4
      - name: Run Claude Code
        run: |
          export ANTHROPIC_BASE_URL=http://localhost:3000
          claude "Review the changes in this PR"
```

## Local Usage with Claude Code
@@ -120,6 +182,41 @@ ANTHROPIC_BASE_URL=http://localhost:3000 claude "Review the API code and suggest
```bash
# Using environment file
echo "ANTHROPIC_PROXY_BASE_URL=https://openrouter.ai/api/v1" > .env
echo "REASONING_MODEL=deepseek/deepseek-r1-0528:free" >> .env
echo "REASONING_MODEL=z-ai/glm-4.5-air:free" >> .env
echo "COMPLETION_MODEL=z-ai/glm-4.5-air:free" >> .env
echo "REASONING_EFFORT=high" >> .env
docker run -d -p 3000:3000 --env-file .env ghcr.io/kiyo-e/claude-code-proxy:latest
```

### Development with Local Claude Code
```bash
# Start the proxy
bun run start

# In another terminal, use with Claude Code
ANTHROPIC_BASE_URL=http://localhost:3000 \
CLAUDE_CODE_PROXY_API_KEY=your_token \
claude "Review this code and suggest improvements"

# Debug mode
DEBUG=true bun run start
```

## Important Implementation Notes

### Request Flow
1. Client sends Claude API format request to `/v1/messages`
2. Proxy transforms to OpenAI format using `transformOpenAIToClaude()`
3. Request forwarded to upstream API (configured via `ANTHROPIC_PROXY_BASE_URL`)
4. Response transformed back to Claude format
5. Streaming responses handled with Server-Sent Events (SSE)
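
For the streaming step, one way to re-emit an OpenAI-style streaming chunk as a Claude-style SSE event is sketched below. The `toClaudeSseEvent` helper and the `OpenAIChunk` type are hypothetical, only text deltas are covered, and the real handler presumably also emits start and stop events and handles tool calls.

```typescript
// Hypothetical, minimal shape of one OpenAI streaming chunk.
interface OpenAIChunk {
  choices: { delta: { content?: string | null } }[];
}

// Converts a text delta into a Claude-style `content_block_delta` SSE event.
// Returns null when the chunk carries no text.
function toClaudeSseEvent(chunk: OpenAIChunk, index = 0): string | null {
  const text = chunk.choices[0]?.delta.content;
  if (!text) return null;
  const payload = {
    type: "content_block_delta",
    index,
    delta: { type: "text_delta", text },
  };
  // An SSE event is an `event:` line, a `data:` line, and a blank line.
  return `event: content_block_delta\ndata: ${JSON.stringify(payload)}\n\n`;
}
```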

### Error Handling
- HTTP errors from upstream API: Returns same status code with error details
- API errors in response body: Returns 500 with error message
- Dropped parameters tracked in `X-Dropped-Params` header
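
The error rules above might look like the following in code. This is a sketch only: `buildErrorResponse` and its types are hypothetical, the body follows the general shape of Claude API error responses, and the `api_error` type string and fallback message are assumptions.

```typescript
// Hypothetical types for this sketch.
interface ClaudeError {
  type: "error";
  error: { type: string; message: string };
}

interface ProxyErrorResponse {
  status: number;
  headers: Record<string, string>;
  body: ClaudeError;
}

interface UpstreamBody {
  error?: { message?: string };
}

// Returns an error response description, or null when the upstream call succeeded.
function buildErrorResponse(
  upstreamStatus: number,
  upstreamBody: UpstreamBody,
  droppedParams: string[] = [],
): ProxyErrorResponse | null {
  const failedHttp = upstreamStatus >= 400;
  if (!failedHttp && !upstreamBody.error) return null;

  const headers: Record<string, string> = {};
  if (droppedParams.length > 0) {
    headers["X-Dropped-Params"] = droppedParams.join(", ");
  }
  return {
    // HTTP failures keep the upstream status; errors in a 2xx body map to 500.
    status: failedHttp ? upstreamStatus : 500,
    headers,
    body: {
      type: "error",
      error: {
        type: "api_error",
        message: upstreamBody.error?.message ?? "Upstream request failed",
      },
    },
  };
}
```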

### Debugging
- Enable `DEBUG=true` to log request/response payloads
- Bearer tokens are automatically masked in logs
- Check health endpoint at `/` for configuration status
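
Token masking in debug logs can be as small as a single replace call. The helper below is a hypothetical sketch, not the proxy's actual logger, and the `****` placeholder is an assumption.

```typescript
// Replaces the credential after "Bearer " with a fixed placeholder.
// The character class stops at whitespace, quotes, and commas so that
// JSON-encoded headers keep their surrounding punctuation.
function maskBearerTokens(text: string): string {
  return text.replace(/(Bearer\s+)[^\s"',]+/gi, "$1****");
}
```

Applying it to the serialized payload just before logging keeps the masking in one place, instead of relying on every call site to remember it.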
25 changes: 15 additions & 10 deletions README.md
@@ -56,8 +56,9 @@ docker run -d -p 3000:3000 -e CLAUDE_CODE_PROXY_API_KEY=your_github_token ghcr.i
docker run -d -p 3000:3000 \
-e CLAUDE_CODE_PROXY_API_KEY=your_openrouter_key \
-e ANTHROPIC_PROXY_BASE_URL=https://openrouter.ai/api/v1 \
-e REASONING_MODEL=deepseek/deepseek-r1-0528:free \
-e COMPLETION_MODEL=deepseek/deepseek-r1-0528:free \
-e REASONING_MODEL=z-ai/glm-4.5-air:free \
-e COMPLETION_MODEL=z-ai/glm-4.5-air:free \
-e REASONING_EFFORT=high \
ghcr.io/kiyo-e/claude-code-proxy:latest

# Use with Claude Code
@@ -71,10 +72,11 @@ ANTHROPIC_BASE_URL=http://localhost:3000 claude "Help me review this code"
cat > .env << EOF
CLAUDE_CODE_PROXY_API_KEY=your_api_key
ANTHROPIC_PROXY_BASE_URL=https://openrouter.ai/api/v1
REASONING_MODEL=deepseek/deepseek-r1-0528:free
COMPLETION_MODEL=deepseek/deepseek-r1-0528:free
REASONING_MODEL=z-ai/glm-4.5-air:free
COMPLETION_MODEL=z-ai/glm-4.5-air:free
REASONING_MAX_TOKENS=4096
COMPLETION_MAX_TOKENS=2048
REASONING_EFFORT=high
DEBUG=false
EOF

@@ -129,9 +131,9 @@ npx wrangler secret put CLAUDE_CODE_PROXY_API_KEY
npx wrangler secret put ANTHROPIC_PROXY_BASE_URL
# Enter: https://openrouter.ai/api/v1
npx wrangler secret put REASONING_MODEL
# Enter: deepseek/deepseek-r1-0528:free
# Enter: z-ai/glm-4.5-air:free
npx wrangler secret put COMPLETION_MODEL
# Enter: deepseek/deepseek-r1-0528:free
# Enter: z-ai/glm-4.5-air:free
```

3. **Test the deployment:**
@@ -190,6 +192,7 @@ npm publish
- `COMPLETION_MODEL` - Model for completion requests (default: openai/gpt-4.1)
- `REASONING_MAX_TOKENS` - Max tokens for reasoning model (optional)
- `COMPLETION_MAX_TOKENS` - Max tokens for completion model (optional)
- `REASONING_EFFORT` - Reasoning effort level for reasoning model (optional, e.g., "low", "medium", "high")
- `DEBUG` - Enable debug logging (default: false)
- `PORT` - Server port for CLI mode (default: 3000)

@@ -203,17 +206,19 @@ npx wrangler secret put CLAUDE_CODE_PROXY_API_KEY
npx wrangler secret put ANTHROPIC_PROXY_BASE_URL

# Set regular environment variables
npx wrangler env put REASONING_MODEL "deepseek/deepseek-r1-0528:free"
npx wrangler env put COMPLETION_MODEL "deepseek/deepseek-r1-0528:free"
npx wrangler env put REASONING_MODEL "z-ai/glm-4.5-air:free"
npx wrangler env put COMPLETION_MODEL "z-ai/glm-4.5-air:free"
npx wrangler env put REASONING_EFFORT "high"
npx wrangler env put DEBUG "false"
```

Alternatively, configure via `wrangler.toml`:

```toml
[env.production.vars]
REASONING_MODEL = "deepseek/deepseek-r1-0528:free"
COMPLETION_MODEL = "deepseek/deepseek-r1-0528:free"
REASONING_MODEL = "z-ai/glm-4.5-air:free"
COMPLETION_MODEL = "z-ai/glm-4.5-air:free"
REASONING_EFFORT = "high"
DEBUG = "false"
```
