diff --git a/README.md b/README.md
index 11f1009acf..2ba7b4828b 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,4 @@
-# cliproxyapi++ 🚀
+# cliproxyapi++
 [![Go Report Card](https://goreportcard.com/badge/github.com/KooshaPari/cliproxyapi-plusplus)](https://goreportcard.com/report/github.com/KooshaPari/cliproxyapi-plusplus)
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
@@ -7,13 +7,13 @@
 English | [中文](README_CN.md)
-**cliproxyapi++** is the definitive high-performance, security-hardened fork of [CLIProxyAPI](https://github.com/router-for-me/CLIProxyAPI). Designed with a "Defense in Depth" philosophy and a "Library-First" architecture, it provides an OpenAI-compatible interface for proprietary LLMs with enterprise-grade stability.
+**cliproxyapi++** is a fork of [CLIProxyAPI](https://github.com/router-for-me/CLIProxyAPI) focused on operational controls, auth lifecycle handling, and reusable proxy components. It provides an OpenAI-compatible interface for proprietary LLMs and follows a defense-in-depth and library-first architecture.
 ---
-## 🏆 Deep Dive: The `++` Advantage
+## Project Overview
-Why choose **cliproxyapi++** over the mainline? While the mainline focus is on open-source stability, the `++` variant is built for high-scale, production environments where security, automated lifecycle management, and broad provider support are critical.
+This section compares baseline capabilities across mainline, `+`, and `++` variants.
 Full feature-by-feature change reference:
@@ -24,48 +24,48 @@ Full feature-by-feature change reference:
 | Capability | Mainline | CLIProxyAPI+ | **cliproxyapi++** | Granular Notes |
 | :--- | :---: | :---: | :---: | :--- |
 | **OpenAI-compatible proxy endpoints** | ✅ | ✅ | ✅ | `chat/completions`, `responses`, `models` surfaces available. |
-| **Provider registry breadth** | ✅ | ✅ | ✅ | Direct + aggregator providers supported in all variants, with broader operational polish in `++`. |
+| **Provider registry breadth** | ✅ | ✅ | ✅ | Direct + aggregator providers supported in all variants, with additional provider operations surfaces in `++`. |
 | **Model aliasing / mapping layer** | ⚠️ | ✅ | ✅ | `++` emphasizes unified mapping behavior across heterogeneous upstreams. |
 | **Management API (`/v0/*`)** | ⚠️ | ✅ | ✅ | Operational controls and inspection endpoints available in `+` and `++`. |
 | **Web management UI** | ❌ | ✅ | ✅ | `++` keeps UI while hardening operational/auth flows behind it. |
 | **Kiro web OAuth flow** | ❌ | ⚠️ | ✅ | `++` includes dedicated `/v0/oauth/kiro` browser-based login surface. |
 | **GitHub Copilot OAuth/device auth depth** | ❌ | ⚠️ | ✅ | `++` adds full lifecycle handling and richer session semantics. |
-| **Advanced multi-provider auth set** | ❌ | ⚠️ | ✅ | Kiro/Copilot/Roo/Kilo/MiniMax/Cursor auth paths integrated in `++`. |
+| **Multi-provider auth set** | ❌ | ⚠️ | ✅ | Kiro/Copilot/Roo/Kilo/MiniMax/Cursor auth paths integrated in `++`. |
 | **Background token refresh worker** | ❌ | ❌ | ✅ | Auto-refresh before expiry to reduce auth-related downtime. |
 | **Credential lifecycle visibility** | ❌ | ⚠️ | ✅ | `++` provides richer auth file/account surfaces for operations. |
 | **Quota-aware provider handling** | ❌ | ⚠️ | ✅ | `++` includes cooldown and provider-state driven routing behavior. |
-| **Rate limiting + intelligent cooldown** | ❌ | ❌ | ✅ | Provider-level cooling/rotation behavior aimed at production resilience. |
+| **Rate limiting + cooldown** | ❌ | ❌ | ✅ | Provider-level cooling/rotation behavior for rate limit handling. |
-| **Failure isolation / route continuity** | ⚠️ | ⚠️ | ✅ | `++` biases toward continuing service via provider-aware routing controls. |
+| **Failure isolation / route continuity** | ⚠️ | ⚠️ | ✅ | `++` routes around unavailable providers when alternatives are configured. |
 | **Core code importability** | ❌ | ❌ | ✅ | Mainline/+ keep `internal/`; `++` exposes reusable `pkg/llmproxy`. |
 | **Library-first architecture** | ⚠️ | ⚠️ | ✅ | Translation/proxy logic packaged for embedding into other Go services. |
-| **Security controls (path guard, hardened base, fingerprinting)** | Basic | Basic | ✅ | Defense-in-depth additions for CI governance and runtime posture. |
-| **Container supply-chain posture** | Basic | Basic | ✅ | Hardened Docker base plus signed/multi-arch release workflow. |
+| **Security controls (path guard, hardened base, fingerprinting)** | Basic | Basic | ✅ | Added controls for CI governance and runtime posture. |
+| **Container supply-chain posture** | Basic | Basic | ✅ | Docker hardening plus signed multi-arch release workflow. |
 | **CI quality gates (strict lint/test/governance)** | Basic | Basic | ✅ | Expanded automation and stricter release validation in `++`. |
-| **Operational observability surfaces** | ⚠️ | ✅ | ✅ | Logs, usage, provider metrics and management views strengthened in `++`. |
-| **Production-readiness target** | Community baseline | Enhanced fork | **Enterprise-grade** | `++` is tuned for long-running agent-heavy deployments. |
+| **Operational observability surfaces** | ⚠️ | ✅ | ✅ | Logs, usage, provider metrics, and management views are expanded in `++`. |
+| **Production-readiness target** | Community baseline | Enhanced fork | Agent-heavy deployment target | `++` targets long-running agent-heavy deployments. |
 ---
-## 🔍 Technical Differences & Hardening
+## Technical Differences
 ### 1. Architectural Evolution: `pkg/llmproxy`
 Unlike the mainline which keeps its core logic in `internal/` (preventing external Go projects from importing it), **cliproxyapi++** has refactored its entire translation and proxying engine into a clean, public `pkg/llmproxy` library.
 * **Reusability**: Import the proxy logic directly into your own Go applications.
 * **Decoupling**: Configuration management is strictly separated from execution logic.
-### 2. Enterprise Authentication & Lifecycle
-* **Full GitHub Copilot Integration**: Not just an API wrapper. `++` includes a full OAuth device flow, per-credential quota tracking, and intelligent session management.
-* **Kiro (AWS CodeWhisperer) 2.0**: A custom-built web UI (`/v0/oauth/kiro`) for browser-based AWS Builder ID and Identity Center logins.
-* **Background Token Refresh**: A dedicated worker service monitors tokens and automatically refreshes them 10 minutes before expiration, ensuring zero downtime for your agents.
+### 2. Authentication & Lifecycle Management
+* **GitHub Copilot integration**: `++` includes OAuth device flow support, per-credential quota tracking, and session handling.
+* **Kiro (AWS CodeWhisperer) login flow**: A web UI (`/v0/oauth/kiro`) supports AWS Builder ID and Identity Center logins.
+* **Background token refresh**: A worker service monitors tokens and refreshes them 10 minutes before expiration.
-### 3. Security Hardening ("Defense in Depth")
+### 3. Security Controls
 * **Path Guard**: A custom GitHub Action workflow (`pr-path-guard`) that prevents any unauthorized changes to critical `internal/translator/` logic during PRs.
 * **Device Fingerprinting**: Generates unique, immutable device identifiers to satisfy strict provider security checks and prevent account flagging.
 * **Hardened Docker Base**: Built on a specific, audited Alpine 3.22.0 layer with minimal packages, reducing the potential attack surface.
-### 4. High-Scale Operations
-* **Intelligent Cooldown**: Automated "cooling" mechanism that detects provider-side rate limits and intelligently pauses requests to specific providers while routing others.
-* **Unified Model Converter**: A sophisticated mapping layer that allows you to request `claude-3-5-sonnet` and have the proxy automatically handle the specific protocol requirements of the target provider (Vertex, AWS, Anthropic, etc.).
+### 4. Operations
+* **Cooldown**: Automated mechanism that detects provider-side rate limits and pauses requests to specific providers while routing others.
+* **Unified model converter**: Mapping layer that translates requested models (for example `claude-3-5-sonnet`) to provider-specific requirements (Vertex, AWS, Anthropic, etc.).
 ---
@@ -158,7 +158,7 @@ The proxy provides two main API surfaces:
 ## 🤝 Contributing
-We maintain strict quality gates to preserve the "hardened" status of the project:
+We maintain strict quality gates:
 1. **Linting**: Must pass `golangci-lint` with zero warnings.
 2. **Coverage**: All new translator logic MUST include unit tests.
 3. **Governance**: Changes to core `pkg/` logic require a corresponding Issue discussion.
@@ -196,7 +196,7 @@ See **[CONTRIBUTING.md](CONTRIBUTING.md)** for more details.
 - [Agent Operator](./docs/docsets/agent/)
 - **Research**: [AgentAPI + cliproxyapi++ tandem and alternatives](./docs/planning/agentapi-cliproxy-integration-research-2026-02-22.md)
 - **Research (300 repo sweep)**: [coder org + 97 adjacent repos](./docs/planning/coder-org-plus-relative-300-inventory-2026-02-22.md)
-- **[Feature Changes in ++](./docs/FEATURE_CHANGES_PLUSPLUS.md)** — Comprehensive list of `++` differences and impacts.
+- **[Feature Changes in ++](./docs/FEATURE_CHANGES_PLUSPLUS.md)** — Detailed list of `++` differences and impacts.
 - **[Docs README](./docs/README.md)** — Core docs map.
 ---
@@ -226,6 +226,5 @@ Distributed under the MIT License. See [LICENSE](LICENSE) for more information.
 ---

- Hardened AI Infrastructure for the Modern Agentic Stack.
- Built with ❤️ by the community.
+ OpenAI-compatible proxy infrastructure for agentic workloads.

diff --git a/docs/FEATURE_CHANGES_PLUSPLUS.md b/docs/FEATURE_CHANGES_PLUSPLUS.md
index aa797a7a98..e8f63981b9 100644
--- a/docs/FEATURE_CHANGES_PLUSPLUS.md
+++ b/docs/FEATURE_CHANGES_PLUSPLUS.md
@@ -7,29 +7,29 @@ This document explains what changed in `cliproxyapi++`, why it changed, and how
 | Change | What changed in `++` | Why it matters |
 |---|---|---|
 | Reusable proxy core | Translation and proxy runtime are structured for reusability (`pkg/llmproxy`) | Enables embedding proxy logic into other Go systems and keeps runtime boundaries cleaner |
-| Stronger module boundaries | Operational and integration concerns are separated from API surface orchestration | Easier upgrades, clearer ownership, lower accidental coupling |
+| Module boundaries | Operational and integration concerns are separated from API surface orchestration | Easier upgrades, clearer ownership, lower accidental coupling |
 ## 2. Authentication and Identity Changes
 | Change | What changed in `++` | Why it matters |
 |---|---|---|
-| Copilot-grade auth support | Extended auth handling for enterprise Copilot-style workflows | More stable integration for organizations depending on tokenized auth stacks |
-| Kiro/AWS login path support | Additional OAuth/login handling pathways and operational UX around auth | Better compatibility for multi-provider enterprise environments |
+| Copilot auth support | Extended auth handling for Copilot-style workflows | More stable integration for tokenized auth stacks |
+| Kiro/AWS login path support | Additional OAuth/login handling pathways and auth-related operational UX | Better compatibility for multi-provider environments |
 | Token lifecycle automation | Background refresh and expiration handling | Reduces downtime from token expiry and manual auth recovery |
 ## 3. Provider and Model Routing Changes
 | Change | What changed in `++` | Why it matters |
 |---|---|---|
-| Broader provider matrix | Expanded provider adapter and model mapping surfaces | More routing options without changing client-side OpenAI API integrations |
-| Unified model translation | Stronger mapping between OpenAI-style model requests and provider-native model names | Lower integration friction and fewer provider mismatch errors |
+| Provider matrix expansion | Expanded provider adapter and model mapping surfaces | More routing options without changing client-side OpenAI API integrations |
+| Unified model translation | Mapping between OpenAI-style model requests and provider-native model names | Lower integration friction and fewer provider mismatch errors |
 | Cooldown and throttling controls | Runtime controls for rate-limit pressure and provider-specific cooldown windows | Better stability under burst traffic and quota pressure |
 ## 4. Security and Governance Changes
 | Change | What changed in `++` | Why it matters |
 |---|---|---|
-| Defense-in-depth hardening | Added stricter operational defaults and hardened deployment assumptions | Safer default posture in production environments |
+| Defense-in-depth controls | Added stricter operational defaults and deployment assumptions | Safer default posture in production environments |
 | Protected core path governance | Workflow-level controls around critical core logic paths | Reduces accidental regressions in proxy translation internals |
 | Device and session consistency controls | Deterministic identity/session behavior for strict provider checks | Fewer auth anomalies in long-running deployments |
@@ -37,7 +37,7 @@ This document explains what changed in `cliproxyapi++`, why it changed, and how
 | Change | What changed in `++` | Why it matters |
 |---|---|---|
-| Stronger CI/CD posture | Expanded release, build, and guard workflows | Faster detection of regressions and safer release cadence |
+| CI/CD workflows | Expanded release, build, and guard workflows | Faster detection of regressions and safer release cadence |
 | Multi-arch/container focus | Production deployment paths optimized for container-first ops | Better portability across heterogeneous infra |
 | Runtime observability surfaces | Improved log and management endpoints | Easier production debugging and incident response |
@@ -50,6 +50,6 @@ This document explains what changed in `cliproxyapi++`, why it changed, and how
 ## 7. Migration Impact Summary
-- **Technical users**: gain higher operational stability, better auth longevity, and stronger multi-provider behavior.
+- **Technical users**: gain operational stability, better auth longevity, and broader multi-provider behavior.
 - **External integrators**: keep OpenAI-compatible interfaces while gaining wider provider compatibility.
-- **Internal maintainers**: get cleaner subsystem boundaries and stronger guardrails for production evolution.
+- **Internal maintainers**: get cleaner subsystem boundaries and clearer guardrails for production evolution.
diff --git a/docs/features/auth/SPEC.md b/docs/features/auth/SPEC.md
index d17da94537..ee89c1804c 100644
--- a/docs/features/auth/SPEC.md
+++ b/docs/features/auth/SPEC.md
@@ -1,8 +1,8 @@
-# Technical Specification: Enterprise Authentication & Lifecycle
+# Technical Specification: Authentication & Lifecycle
 ## Overview
-**cliproxyapi++** implements enterprise-grade authentication management with full lifecycle automation, supporting multiple authentication flows (API keys, OAuth, device authorization) and automatic token refresh capabilities.
+**cliproxyapi++** implements authentication lifecycle management with multiple flows (API keys, OAuth, device authorization) and automatic token refresh.
 ## Authentication Architecture
@@ -315,7 +315,7 @@ auth:
 refresh_lead_time: "10m"
 ```
-**Refresh Lead Time**: Tokens are refreshed 10 minutes before expiration to ensure zero downtime.
+**Refresh Lead Time**: Tokens are refreshed 10 minutes before expiration to reduce token-expiry interruptions.
 ### Refresh Strategies
diff --git a/docs/features/auth/USER.md b/docs/features/auth/USER.md
index 79552c5de4..b7a1a99185 100644
--- a/docs/features/auth/USER.md
+++ b/docs/features/auth/USER.md
@@ -1,8 +1,8 @@
-# User Guide: Enterprise Authentication
+# User Guide: Authentication
 ## Understanding Authentication in cliproxyapi++
-cliproxyapi++ supports multiple authentication methods for different LLM providers. The authentication system handles credential management, automatic token refresh, and quota tracking seamlessly in the background.
+cliproxyapi++ supports multiple authentication methods for different LLM providers. The authentication system handles credential management, automatic token refresh, and quota tracking.
 ## Quick Start: Adding Credentials
@@ -89,7 +89,7 @@ curl -X POST http://localhost:8317/v0/management/auths \
 - Mistral
 - Groq
 - DeepSeek
-- And many more
+- Additional providers can be configured through provider blocks
 **Setup**:
 ```json
@@ -126,7 +126,7 @@ open http://localhost:8317/v0/oauth/copilot
 # Enter your GitHub credentials
 # Authorize the application
-# Done! Token is stored and managed automatically
+# Token is stored and managed automatically
 ```
 ### Custom Provider Authentication
@@ -310,7 +310,7 @@ Response:
 When quota is exhausted or token expires:
 1. System selects next available credential
 2. Notifications sent (configured)
-3. Load continues seamlessly
+3. Requests continue with the next available credential
 ### Manual Rotation
@@ -406,7 +406,7 @@ curl -X POST http://localhost:8317/v0/management/auths \
 3. **Monitor refresh failures**
 4. **Review credential usage** patterns
-## Advanced: Encryption
+## Encryption
 Enable credential encryption:
diff --git a/docs/features/operations/SPEC.md b/docs/features/operations/SPEC.md
index d601442e27..8592adcff8 100644
--- a/docs/features/operations/SPEC.md
+++ b/docs/features/operations/SPEC.md
@@ -1,8 +1,8 @@
-# Technical Specification: High-Scale Operations
+# Technical Specification: Operations
 ## Overview
-**cliproxyapi++** is designed for high-scale production environments with intelligent operations features: automated cooldown, load balancing, health checking, and comprehensive observability.
+**cliproxyapi++** includes operations features for cooldown handling, load balancing, health checks, and observability.
 ## Operations Architecture
@@ -10,7 +10,7 @@
 ```
 Operations Layer
-├── Intelligent Cooldown System
+├── Cooldown System
 │   ├── Rate Limit Detection
 │   ├── Provider-Specific Cooldown
 │   ├── Automatic Recovery
diff --git a/docs/index.md b/docs/index.md
index a8d604c7ac..acf64ca6ca 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -1,6 +1,6 @@
 # cliproxyapi++ Docs
-`cliproxyapi++` is a high-performance OpenAI-compatible proxy that lets you route one client API surface to many upstream providers.
+`cliproxyapi++` is an OpenAI-compatible proxy that routes one client API surface to multiple upstream providers.
 ## Who This Documentation Is For
@@ -14,7 +14,7 @@
 - Use one endpoint (`/v1/*`) across heterogeneous providers.
 - Configure routing and model-prefix behavior in `config.yaml`.
 - Manage credentials and runtime controls through management APIs.
-- Monitor health and per-provider metrics for production operations.
+- Monitor health and per-provider metrics for operations.
 ## Start Here
@@ -23,10 +23,10 @@
 3. [Provider Usage](/provider-usage) for provider strategy and setup patterns.
 4. [Provider Quickstarts](/provider-quickstarts) for provider-specific 5-minute success paths.
 5. [Provider Catalog](/provider-catalog) for provider block reference.
-5. [Provider Operations](/provider-operations) for on-call runbook and incident workflows.
-6. [Routing and Models Reference](/routing-reference) for model resolution behavior.
-7. [Troubleshooting](/troubleshooting) for common failures and concrete fixes.
-8. [Planning Boards](/planning/) for source-linked execution tracking and import-ready board artifacts.
+6. [Provider Operations](/provider-operations) for on-call runbook and incident workflows.
+7. [Routing and Models Reference](/routing-reference) for model resolution behavior.
+8. [Troubleshooting](/troubleshooting) for common failures and concrete fixes.
+9. [Planning Boards](/planning/) for source-linked execution tracking and import-ready board artifacts.
 ## API Surfaces