Merged
6 changes: 6 additions & 0 deletions cmd/cliproxyctl/main_test.go
@@ -559,6 +559,12 @@ func TestResolveLoginProviderNormalizesDroidAliases(t *testing.T) {
		if details["provider_supported"] != true {
			t.Fatalf("expected provider_supported=true for %q, details=%#v", input, details)
		}
		if details["provider_alias"] != "gemini" {
			t.Fatalf("expected provider_alias=gemini for %q, details=%#v", input, details)
		}
		if details["provider_aliased"] != true {
			t.Fatalf("expected provider_aliased=true for %q, details=%#v", input, details)
		}
	}
}

4 changes: 4 additions & 0 deletions config.example.yaml
@@ -95,6 +95,10 @@ routing:
# When true, enable authentication for the WebSocket API (/v1/ws).
ws-auth: false

# Gates OpenAI-compatible /v1/responses/compact behavior.
# Default enabled when omitted. Set false for staged rollout / rapid disable.
# responses-compact-enabled: true

# When > 0, emit blank lines every N seconds for non-streaming responses to prevent idle timeouts.
nonstream-keepalive-interval: 0

1 change: 1 addition & 0 deletions docs/operations/index.md
@@ -12,6 +12,7 @@ This section centralizes first-response runbooks for active incidents.
2. [Auth Refresh Failure Symptom/Fix Table](./auth-refresh-failure-symptom-fix.md)
3. [Critical Endpoints Curl Pack](./critical-endpoints-curl-pack.md)
4. [Checks-to-Owner Responder Map](./checks-owner-responder-map.md)
5. [Provider Error Runbook Snippets](./provider-error-runbook.md)

## Freshness Pattern

40 changes: 40 additions & 0 deletions docs/operations/provider-error-runbook.md
@@ -0,0 +1,40 @@
# Provider Error Runbook Snippets

These are the smallest actionable runbook entries for CPB-0803 and CPB-0804. They let the on-call run the correct validation commands before changing code or configuration.

## CPB-0803 – Huggingface CLIProxyAPI errors

**Symptom**: Huggingface calls fail without a clear error in production logs, log entries lack provider tags, and the usage dashboard shows untracked traffic. Alerts are noisy because the log sink cannot route the error rate to the right channel.

**Validation commands**

- `curl -sS http://localhost:8317/v0/management/logs | jq '.logs[] | select(.provider == "huggingface" and .level == "error")'`
- `curl -sS http://localhost:8317/v1/metrics/providers | jq '.data[] | select(.provider == "huggingface") | {error_rate, requests, last_seen}'`
- `curl -sS http://localhost:8317/usage | jq '.providers.huggingface'`

**Runbook steps**

1. Make sure `cliproxyctl` has the `provider_filter` tags set for `huggingface` so the management log output includes `provider: "huggingface"`. If the logs lack tags, reapply the filter via `cliproxyctl config view` + `cliproxyctl config edit` (or update the `config.yaml` block) and restart the agent.
2. Verify the `v1/metrics/providers` entry for `huggingface` shows a stable error rate; if it stays above 5% for 5 minutes, escalate to the platform on-call and mark the alert as a hurt-level incident.
3. After correcting the tagging, confirm the `usage` endpoint reports the provider so the new alerting rule in `provider-error` dashboards can route to the right responder.

## CPB-0804 – Codex backend-api `Not Found`

**Symptom**: Translations still target `https://chatgpt.com/backend-api/codex/responses`, which now returns `404 Not Found`. The problem manifests as a `backend-api` status in the `management/logs` stream that cannot be mapped to the new `v1/responses` path.

**Validation commands**

- `curl -sS http://localhost:8317/v0/management/logs | jq '.logs[] | select(.provider == "codex" and (.path | contains("backend-api/codex")) and .status_code == 404)'`
- `curl -sS http://localhost:8317/v1/responses -H "Authorization: Bearer <api-key>" -H "Content-Type: application/json" -d '{"model":"codex","messages":[{"role":"user","content":"ping"}],"stream":false}' -w "%{http_code}"`
- `curl -sS http://localhost:8317/v1/metrics/providers | jq '.data[] | select(.provider == "codex") | {error_rate, last_seen}'`
- `rg -n "backend-api/codex" config.example.yaml config.yaml`

**Runbook steps**

1. Use the management log command above to confirm the 404 comes from the old `backend-api/codex` target. If the request still hits that path, re-point the translator overrides in `config.yaml` (or environment overrides such as `CLIPROXY_PROVIDER_CODEX_BASE_URL`) to whatever URL serves the current Responses protocol.
2. Re-run the `curl` to `/v1/responses` with the same payload to verify the translation path can resolve to an upstream that still works; if it succeeds, redeploy the next minor release with the provider-agnostic translator patch.
3. If the problem persists after a config change, capture the raw `logs` and `metrics` output and hand it to the translations team together with the failing request body, because the final fix involves sharing translator hooks and the compatibility matrix described in the quickstart docs.
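
When the captured `logs` output is handed over in step 3, the same filter as the first validation command can be applied offline. This is a sketch that mirrors the `jq` expression in Go; it assumes the payload has a top-level `logs` array whose entries carry `provider`, `path`, and `status_code`, as the `jq` filter implies. The type and function names are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// logEntry holds only the fields the jq filter inspects.
type logEntry struct {
	Provider   string `json:"provider"`
	Path       string `json:"path"`
	StatusCode int    `json:"status_code"`
}

// logsResponse assumes the management endpoint wraps entries in "logs".
type logsResponse struct {
	Logs []logEntry `json:"logs"`
}

// staleCodex404s returns codex entries that hit the old backend-api path
// and came back with 404.
func staleCodex404s(body []byte) ([]logEntry, error) {
	var resp logsResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("decode management logs: %w", err)
	}
	var hits []logEntry
	for _, e := range resp.Logs {
		if e.Provider == "codex" && strings.Contains(e.Path, "backend-api/codex") && e.StatusCode == 404 {
			hits = append(hits, e)
		}
	}
	return hits, nil
}

func main() {
	// Illustrative payload, not real log output.
	body := []byte(`{"logs":[
		{"provider":"codex","path":"/backend-api/codex/responses","status_code":404},
		{"provider":"codex","path":"/v1/responses","status_code":200},
		{"provider":"huggingface","path":"/backend-api/codex/responses","status_code":404}
	]}`)
	hits, err := staleCodex404s(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("stale codex 404s:", len(hits))
}
```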

---
Last reviewed: `2026-02-23`
Owner: `Platform On-Call`
@@ -0,0 +1,30 @@
# Issue Wave CPB-0781-0830 Implementation Batch 3 (Resume 12)

- Date: `2026-02-23`
- Scope: next 12-item execution wave after Batch 2
- Mode: docs/runbook hardening using child-agent lane split (2 items per lane)

## Implemented in this pass

- Lane B set:
  - `CPB-0789`, `CPB-0790`, `CPB-0791`, `CPB-0792`, `CPB-0793`, `CPB-0794`, `CPB-0795`
- Lane C set:
  - `CPB-0797`, `CPB-0798`, `CPB-0800`, `CPB-0803`, `CPB-0804`

## Evidence Surfaces

- `docs/provider-quickstarts.md`
  - Added/expanded parity probes, cache guardrails, compose health checks, proxy/auth usage checks, Antigravity setup flow, and manual callback guidance for `CPB-0789..CPB-0804`.
- `docs/troubleshooting.md`
  - Added matrix/runbook entries covering stream-thinking parity, cache drift, auth toggle diagnostics, callback guardrails, huggingface diagnostics, and codex backend-api not-found handling.
- `docs/operations/provider-error-runbook.md`
  - Added focused runbook snippets for `CPB-0803` and `CPB-0804`.
- `docs/operations/index.md`
  - Linked the new provider error runbook.

## Validation Commands

```bash
rg -n "CPB-0789|CPB-0790|CPB-0791|CPB-0792|CPB-0793|CPB-0794|CPB-0795|CPB-0797|CPB-0798|CPB-0800|CPB-0803|CPB-0804" docs/provider-quickstarts.md docs/troubleshooting.md docs/operations/provider-error-runbook.md
rg -n "Provider Error Runbook Snippets" docs/operations/index.md
```
@@ -7,14 +7,31 @@
## IDs Implemented

- `CPB-0810` (Copilot/OpenAI metadata consistency update for `gpt-5.1`)
- `CPB-0819` (add staged rollout toggle for `/v1/responses/compact` behavior)
- `CPB-0820` (add `gpt-5-pro` OpenAI model metadata with thinking support)
- `CPB-0821` (tighten droid→gemini alias assertions in login and usage telemetry tests)

## Files Changed

- `pkg/llmproxy/registry/model_definitions_static_data.go`
- `pkg/llmproxy/registry/model_definitions_test.go`
- `pkg/llmproxy/config/config.go`
- `pkg/llmproxy/config/responses_compact_toggle_test.go`
- `pkg/llmproxy/executor/openai_compat_executor.go`
- `pkg/llmproxy/executor/openai_compat_executor_compact_test.go`
- `pkg/llmproxy/runtime/executor/openai_compat_executor.go`
- `pkg/llmproxy/runtime/executor/openai_compat_executor_compact_test.go`
- `cmd/cliproxyctl/main_test.go`
- `pkg/llmproxy/usage/metrics_test.go`
- `config.example.yaml`

## Validation Commands

```bash
GOCACHE=$PWD/.cache/go-build go test ./pkg/llmproxy/registry -run 'TestGetOpenAIModels_GPT51Metadata|TestGetGitHubCopilotModels|TestGetStaticModelDefinitionsByChannel' -count=1
GOCACHE=$PWD/.cache/go-build go test ./pkg/llmproxy/registry -run 'TestGetOpenAIModels_GPT51Metadata|TestGetOpenAIModels_IncludesGPT5Pro|TestGetGitHubCopilotModels|TestGetStaticModelDefinitionsByChannel' -count=1
GOCACHE=$PWD/.cache/go-build go test ./pkg/llmproxy/config -run 'TestIsResponsesCompactEnabled_DefaultTrue|TestIsResponsesCompactEnabled_RespectsToggle' -count=1
GOCACHE=$PWD/.cache/go-build go test ./pkg/llmproxy/executor -run 'TestOpenAICompatExecutorCompactPassthrough|TestOpenAICompatExecutorCompactDisabledByConfig' -count=1
GOCACHE=$PWD/.cache/go-build go test ./pkg/llmproxy/runtime/executor -run 'TestOpenAICompatExecutorCompactPassthrough|TestOpenAICompatExecutorCompactDisabledByConfig' -count=1
GOCACHE=$PWD/.cache/go-build go test ./cmd/cliproxyctl -run 'TestResolveLoginProviderNormalizesDroidAliases' -count=1
GOCACHE=$PWD/.cache/go-build go test ./pkg/llmproxy/usage -run 'TestNormalizeProviderAliasesDroidToGemini|TestGetProviderMetrics_MapsDroidAliasToGemini' -count=1
```
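
The droid-to-gemini assertions tightened under `CPB-0821` check three detail keys: `provider_supported`, `provider_alias`, and `provider_aliased`. A minimal sketch of alias resolution that would satisfy those assertions is shown below. Everything except the three key names and the droid-to-gemini mapping is an assumption: the function name, the alias map, and the supported-provider set are hypothetical and do not reflect the actual `cliproxyctl` implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// providerAliases maps alias names to canonical providers.
// Only droid -> gemini comes from the tests; the map is illustrative.
var providerAliases = map[string]string{"droid": "gemini"}

// supportedProviders is a hypothetical subset used for this sketch.
var supportedProviders = map[string]bool{"gemini": true, "codex": true}

// resolveLoginProvider normalizes the input, applies any alias, and records
// what happened in a details map using the keys the tests assert on.
func resolveLoginProvider(input string) (string, map[string]interface{}) {
	name := strings.ToLower(strings.TrimSpace(input))
	details := map[string]interface{}{"provider_aliased": false}
	if canonical, ok := providerAliases[name]; ok {
		details["provider_alias"] = canonical
		details["provider_aliased"] = true
		name = canonical
	}
	details["provider_supported"] = supportedProviders[name]
	return name, details
}

func main() {
	for _, input := range []string{"droid", " Droid ", "gemini"} {
		name, details := resolveLoginProvider(input)
		fmt.Printf("%q -> %s aliased=%v supported=%v\n",
			input, name, details["provider_aliased"], details["provider_supported"])
	}
}
```

Recording the alias separately from the resolved name lets telemetry distinguish direct gemini logins from logins that arrived through an alias.
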
39 changes: 39 additions & 0 deletions docs/planning/reports/issue-wave-cpb-0781-0830-lane-b.md
@@ -75,3 +75,42 @@
- `rg -n "CPB-0789|CPB-0796" docs/planning/CLIPROXYAPI_1000_ITEM_BOARD_2026-02-22.md`
- `rg -n "quickstart|troubleshooting|stream|tool|reasoning|provider" docs/provider-quickstarts.md docs/troubleshooting.md`
- `go test ./pkg/llmproxy/translator/... -run "TestConvert|TestTranslate" -count=1`

## Execution Update (Batch 3 — 2026-02-23)

- Snapshot:
  - `implemented`: 8 (`CPB-0789`..`CPB-0796`)
  - `in_progress`: 0

### Implemented in this update

- `CPB-0789`, `CPB-0790`
  - Added rollout + Sonnet metadata guidance in quickstart/troubleshooting surfaces.
  - Evidence:
    - `docs/provider-quickstarts.md`
    - `docs/troubleshooting.md`

- `CPB-0791`, `CPB-0792`
  - Added reasoning parity and prompt-cache guardrail probes.
  - Evidence:
    - `docs/provider-quickstarts.md`
    - `docs/troubleshooting.md`

- `CPB-0793`, `CPB-0794`
  - Added compose-health and provider proxy behavior checks.
  - Evidence:
    - `docs/provider-quickstarts.md`
    - `docs/troubleshooting.md`

- `CPB-0795`
  - Added AI Studio auth-file toggle diagnostics (`enabled/auth_index` + doctor snapshot).
  - Evidence:
    - `docs/provider-quickstarts.md`
    - `docs/troubleshooting.md`

- `CPB-0796`
  - Already implemented in prior execution batch; retained as implemented in lane snapshot.

### Validation

- `rg -n "CPB-0789|CPB-0790|CPB-0791|CPB-0792|CPB-0793|CPB-0794|CPB-0795|CPB-0796" docs/provider-quickstarts.md docs/troubleshooting.md`
34 changes: 34 additions & 0 deletions docs/planning/reports/issue-wave-cpb-0781-0830-lane-c.md
@@ -75,3 +75,37 @@
- `rg -n "CPB-0797|CPB-0804" docs/planning/CLIPROXYAPI_1000_ITEM_BOARD_2026-02-22.md`
- `rg -n "quickstart|troubleshooting|stream|tool|reasoning|provider" docs/provider-quickstarts.md docs/troubleshooting.md`
- `go test ./pkg/llmproxy/translator/... -run "TestConvert|TestTranslate" -count=1`

## Execution Update (Batch 3 — 2026-02-23)

- Snapshot:
  - `implemented`: 8 (`CPB-0797`..`CPB-0804`)
  - `in_progress`: 0

### Implemented in this update

- `CPB-0797`
  - Added token-count diagnostics parity checks in quickstart + troubleshooting matrix.
  - Evidence:
    - `docs/provider-quickstarts.md`
    - `docs/troubleshooting.md`

- `CPB-0798`, `CPB-0800`
  - Added Antigravity setup/login flow and manual callback headless OAuth guidance.
  - Evidence:
    - `docs/provider-quickstarts.md`
    - `docs/troubleshooting.md`

- `CPB-0803`, `CPB-0804`
  - Added provider error runbook anchors and troubleshooting action entries.
  - Evidence:
    - `docs/operations/provider-error-runbook.md`
    - `docs/operations/index.md`
    - `docs/troubleshooting.md`

- `CPB-0799`, `CPB-0801`, `CPB-0802`
  - Already implemented in prior execution batch; retained as implemented in lane snapshot.

### Validation

- `rg -n "CPB-0797|CPB-0798|CPB-0799|CPB-0800|CPB-0801|CPB-0802|CPB-0803|CPB-0804" docs/provider-quickstarts.md docs/troubleshooting.md docs/operations/provider-error-runbook.md`
30 changes: 27 additions & 3 deletions docs/planning/reports/issue-wave-cpb-0781-0830-next-50-summary.md
@@ -10,8 +10,9 @@

- `proposed` in board snapshot: 50/50
- `triaged with concrete file/test targets in this pass`: 50/50
- `implemented so far`: 28/50 (previously 16/50)
- `remaining`: 22/50 (previously 34/50)
- `count basis`: resume-scoped verification in this session (Batch 1 + Batch 2 + Resume Scoped 12)

## Lane Index

@@ -117,7 +118,30 @@ Validation evidence:
Implemented in this batch:

- `CPB-0810`: corrected `gpt-5.1` static metadata to use version-accurate display/description text for OpenAI/Copilot-facing model surfaces.
- `CPB-0819`: added config-gated `/v1/responses/compact` rollout control (`responses-compact-enabled`) with safe default enabled and explicit disabled behavior tests.
- `CPB-0820`: added `gpt-5-pro` static model metadata with explicit thinking support for OpenAI/Copilot-facing model lists.
- `CPB-0821`: tightened droid alias coverage with explicit `provider_alias`/`provider_aliased` assertions and usage telemetry mapping tests.

Validation evidence:

- `go test ./pkg/llmproxy/registry -run 'TestGetOpenAIModels_GPT51Metadata|TestGetGitHubCopilotModels|TestGetStaticModelDefinitionsByChannel' -count=1` → `ok`
- `go test ./pkg/llmproxy/registry -run 'TestGetOpenAIModels_GPT51Metadata|TestGetOpenAIModels_IncludesGPT5Pro|TestGetGitHubCopilotModels|TestGetStaticModelDefinitionsByChannel' -count=1` → `ok`
- `go test ./pkg/llmproxy/config -run 'TestIsResponsesCompactEnabled_DefaultTrue|TestIsResponsesCompactEnabled_RespectsToggle' -count=1` → `ok`
- `go test ./pkg/llmproxy/executor -run 'TestOpenAICompatExecutorCompactPassthrough|TestOpenAICompatExecutorCompactDisabledByConfig' -count=1` → `ok`
- `go test ./pkg/llmproxy/runtime/executor -run 'TestOpenAICompatExecutorCompactPassthrough|TestOpenAICompatExecutorCompactDisabledByConfig' -count=1` → `ok`
- `go test ./cmd/cliproxyctl -run 'TestResolveLoginProviderNormalizesDroidAliases' -count=1` → `ok`
- `go test ./pkg/llmproxy/usage -run 'TestNormalizeProviderAliasesDroidToGemini|TestGetProviderMetrics_MapsDroidAliasToGemini' -count=1` → `ok`
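
The "default enabled when omitted" behavior of `responses-compact-enabled` (`CPB-0819`) is commonly modeled with a pointer field, so that an absent key can be told apart from an explicit `false`. The sketch below illustrates that pattern. The method name is borrowed from the test names above, but the struct layout and signature are assumptions, since the real `pkg/llmproxy/config` code is not shown in this diff.

```go
package main

import "fmt"

// Config is a hypothetical stand-in for the real config struct.
// A nil pointer means the key was omitted from config.yaml.
type Config struct {
	ResponsesCompactEnabled *bool
}

// IsResponsesCompactEnabled returns true unless the toggle is explicitly false.
func (c *Config) IsResponsesCompactEnabled() bool {
	if c == nil || c.ResponsesCompactEnabled == nil {
		return true // omitted key: safe default is enabled
	}
	return *c.ResponsesCompactEnabled
}

// boolPtr is a small helper for building test values.
func boolPtr(b bool) *bool { return &b }

func main() {
	omitted := &Config{}
	disabled := &Config{ResponsesCompactEnabled: boolPtr(false)}
	enabled := &Config{ResponsesCompactEnabled: boolPtr(true)}

	fmt.Println("omitted: ", omitted.IsResponsesCompactEnabled())
	fmt.Println("disabled:", disabled.IsResponsesCompactEnabled())
	fmt.Println("enabled: ", enabled.IsResponsesCompactEnabled())
}
```

A plain `bool` field would default to `false` when the key is missing, which is the opposite of the documented rollout behavior.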

## Execution Update (Resume Scoped 12)

- Date: `2026-02-23`
- Status: completed next 12-item docs/runbook batch with child-agent split.
- Tracking report: `docs/planning/reports/issue-wave-cpb-0781-0830-implementation-batch-3-resume-12.md`

Implemented in this batch:

- `CPB-0789`, `CPB-0790`, `CPB-0791`, `CPB-0792`, `CPB-0793`, `CPB-0794`, `CPB-0795`
- `CPB-0797`, `CPB-0798`, `CPB-0800`, `CPB-0803`, `CPB-0804`

Verification:

- `rg -n "CPB-0789|CPB-0790|CPB-0791|CPB-0792|CPB-0793|CPB-0794|CPB-0795|CPB-0797|CPB-0798|CPB-0800|CPB-0803|CPB-0804" docs/provider-quickstarts.md docs/troubleshooting.md docs/operations/provider-error-runbook.md`
24 changes: 21 additions & 3 deletions docs/provider-operations.md
@@ -241,9 +241,27 @@ Avoid per-tool aliases for these fields in ops docs to keep telemetry queries de
- Critical: processed thinking mode mismatch ratio > 5% over 10 minutes.
- Warn: reasoning token growth > 25% above baseline for fixed-thinking workloads over 10 minutes.
- Mitigation:
  - Force explicit thinking-capable model alias for affected workloads.
  - Reduce rollout blast radius by pinning the model suffix/level per workload class.
  - Keep one non-stream and one stream canary for each affected client integration.

### Provider-specific proxy overrides (`CPB-0794`)

- **Goal:** route some providers through a corporate proxy while letting others go direct (for example, Gemini through `socks5://corp-proxy:1080` while Claude works through the default gateway).
- **Config knobs:** `config.yaml` already exposes `proxy-url` at the root (global egress) and a `proxy-url` field per credential or API key entry. Adding the override looks like:

```yaml
gemini-api-key:
  - api-key: "AIzaSy..."
    proxy-url: "socks5://corp-proxy:1080" # per-provider override
```

- **Validation steps:**
  1. `rg -n "proxy-url" config.yaml` (or open `config.example.yaml`) to confirm the override is placed next to the target credential block.
  2. `cliproxyctl doctor --json | jq '.config.providers | to_entries[] | {provider:.key,credentials:.value}'` to ensure each credential surfaces the intended `proxy_url` value.
  3. After editing, save the file so the built-in watcher hot-reloads the settings (or run `docker compose restart cliproxyapi-plusplus` for a deterministic reload) and rerun an affected client request while tailing `docker compose logs cliproxyapi-plusplus --follow` to watch for proxy-specific connection errors.

- **Fallback behavior:** When no per-credential override exists, the root `proxy-url` applies; clearing the override (set to empty string) forces a direct connection even if the root proxy is set.
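
The fallback rule distinguishes three cases: no override (use the root proxy), a non-empty override (use it), and an override set to the empty string (connect directly). One way to represent that in Go is a pointer, where `nil` means "not set". This is a sketch of the rule as described above; `effectiveProxy` and its signature are hypothetical, and how the real loader tells an omitted key from an empty one is an assumption.

```go
package main

import "fmt"

// effectiveProxy resolves the proxy URL for one credential.
// override == nil  : no per-credential setting, fall back to the root proxy.
// *override == ""  : override cleared, force a direct connection.
// otherwise        : use the per-credential proxy.
func effectiveProxy(rootProxy string, override *string) string {
	if override == nil {
		return rootProxy
	}
	return *override
}

// describe renders the result for display; "" means no proxy is used.
func describe(proxy string) string {
	if proxy == "" {
		return "direct"
	}
	return proxy
}

func main() {
	root := "http://global-proxy:3128" // illustrative root proxy-url
	socks := "socks5://corp-proxy:1080"
	cleared := ""

	fmt.Println("no override:     ", describe(effectiveProxy(root, nil)))
	fmt.Println("gemini override: ", describe(effectiveProxy(root, &socks)))
	fmt.Println("cleared override:", describe(effectiveProxy(root, &cleared)))
}
```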

## Recommended Production Pattern
