
codex/stability fix 1 #861

Merged
KooshaPari merged 6 commits into main from codex/stability-fix-1 on Mar 15, 2026

Conversation

Owner

@KooshaPari commented Mar 15, 2026

  • fix(openai): stabilize thinking and compat paths
  • fix(iflow): preserve non-thinking template kwargs
  • fix(sdk): restore auth manager compatibility
  • fix(stability): restore auth and executor compatibility

Summary by CodeRabbit

  • New Features

    • Added support for reasoning effort configuration with levels: none, low, medium, and high for improved thinking customization.
    • Enhanced endpoint compatibility resolution with multi-provider support.
    • Improved handling for MiniMax models.
  • Bug Fixes

    • Fixed cleanup of extraneous reasoning-related metadata in request processing.
  • Tests

    • Added comprehensive test coverage for reasoning effort extraction and endpoint override resolution.
  • Documentation

    • Updated code style and formatting for consistency.

KooshaPari and others added 4 commits March 14, 2026 00:17
Co-authored-by: Codex <noreply@openai.com>

coderabbitai Bot commented Mar 15, 2026

Caution

Review failed

Pull request was closed or merged during review

📝 Walkthrough

The changes refactor authentication token storage to embed a shared BaseTokenStorage struct across multiple providers (Gemini, Copilot, Kilo), add parsing support for iFlow reasoning effort levels with corresponding tests, enhance endpoint override resolution with provider-specific logic, and apply formatting normalizations to documentation configuration files.

Changes

  • Token Storage Refactoring - Auth Handlers (pkg/llmproxy/api/handlers/management/auth_gemini.go, auth_github.go, auth_kilo.go): Updated token storage initialization to embed BaseTokenStorage with Email/AccessToken/Type fields; added base package imports. Each handler initializes the nested struct with provider-specific values.
  • Token Storage Refactoring - Auth Packages (pkg/llmproxy/auth/gemini/gemini_auth_test.go, sdk/auth/kilo.go): Extended token storage structs to include a BaseTokenStorage field; updated test fixtures and initialization logic. Test error message expectations refined to check only "invalid file path".
  • Token Storage Tests - Path Validation (pkg/llmproxy/auth/kimi/token_path_test.go): Simplified the error string assertion in the path traversal rejection test from "invalid token file path" to "invalid file path".
  • Thinking - iFlow Reasoning Effort Parsing (pkg/llmproxy/thinking/apply.go, apply_iflow_test.go): Introduced parseIFlowReasoningEffort() to map reasoning effort strings ("none", "low", "medium", "high") to ThinkingConfig modes/levels. Added comprehensive test coverage with nested and flat JSON object structures.
  • Thinking - Field Stripping (pkg/llmproxy/thinking/strip.go, strip_test.go): Extended iFlow provider field removal to include "reasoning", "reasoning_split", "reasoning_effort", and "variant"; added cleanup logic to delete an empty chat_template_kwargs. Added tests validating field removal while preserving unrelated content.
  • iFlow Provider - Legacy Field Handling (pkg/llmproxy/thinking/provider/iflow/apply.go, apply_test.go): For MiniMax models, added removal of reasoning metadata fields (reasoning_effort, reasoning, variant) after setting reasoning_split. Comprehensive tests validate field stripping across thinking modes and model support levels.
  • Endpoint Override Resolution (sdk/api/handlers/openai/endpoint_compat.go, endpoint_compat_test.go): Enhanced resolveEndpointOverride() to iterate through model providers querying provider-specific overrides; added the resolveEndpointOverrideForInfo() helper and shouldForceNonStreamingChatBridge() for MiniMax detection. Updated function signatures across call sites.
  • OpenAI Handler Integration (sdk/api/handlers/openai/openai_handlers.go, openai_responses_handlers.go): Updated calls to resolveEndpointOverride() to pass the new provider parameter (empty string); preserves existing override logic with the new signature.
  • Documentation - Formatting (docs/.vitepress/config.ts, theme/index.ts): Normalized quote style (single to double quotes), added semicolons, adjusted indentation for consistency.
  • Documentation - Plugin Updates (docs/.vitepress/plugins/content-tabs.ts): Added an exported tabsClientScript constant containing client-side initialization; applied formatting normalization (quotes, semicolons).
  • Documentation - Vue Component Formatting (docs/.vitepress/theme/components/CategorySwitcher.vue): Minor type annotation formatting in the defineProps() signature (added trailing semicolon).

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • PR #826: Applies identical BaseTokenStorage embedding pattern to token struct initialization across multiple authentication providers (Gemini, Copilot, Kilo).
  • PR #822: Refactors provider auth token structs to embed shared BaseTokenStorage with consistent import and field initialization changes.
  • PR #824: Migrates provider token storage implementations to embed base.BaseTokenStorage, following the same structural consolidation pattern.

Poem

🐰 Tokens now nestle in their Base, so neat,
Reasoning effort makes thoughts complete,
Endpoints skip between providers with grace,
While docs get quoted and spaced just right—
A hop toward clarity, pixel by pixel bright!

🚥 Pre-merge checks: ✅ 1 passed | ❌ 2 failed

❌ Failed checks (1 warning, 1 inconclusive)

  • Docstring Coverage ⚠️ Warning: Docstring coverage is 12.20%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
  • Title check ❓ Inconclusive: The title 'codex/stability fix 1' is vague and generic and does not convey meaningful information about the changeset. Resolution: replace it with a clear, specific title that summarizes the main change, such as 'Add BaseTokenStorage embedding and iFlow thinking/endpoint support' or 'Refactor auth structures and stabilize thinking provider logic'.

✅ Passed checks (1 passed)

  • Description Check ✅ Passed: Check skipped because CodeRabbit's high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on enhancing the stability, compatibility, and maintainability of the LLM proxy. It introduces a significant refactoring of authentication token storage to use a common base, simplifies the Kiro executor by removing redundant authentication logic, and improves the handling of "thinking" configurations for iFlow models. Additionally, it refines OpenAI endpoint compatibility and restores critical functionalities within the authentication manager, ensuring more robust and consistent operation across different LLM providers and features.

Highlights

  • Authentication Refactoring: Standardized token storage across multiple LLM providers (Gemini, GitHub Copilot, Kilo, Codex, Qwen) by embedding a BaseTokenStorage struct, improving consistency and maintainability.
  • Kiro Executor Simplification: Removed dedicated Kiro authentication logic from the executor, streamlining the executor's responsibilities and reducing dependencies.
  • iFlow Thinking Configuration Enhancements: Improved handling of "thinking" configurations for iFlow models, including new parsing logic for reasoning.effort and more robust stripping of thinking-related fields from request bodies.
  • OpenAI Endpoint Compatibility: Enhanced OpenAI endpoint compatibility, particularly for iFlow models, by refining how endpoint overrides are resolved and introducing logic to force non-streaming chat bridges for specific models.
  • Auth Manager and Cooldown Compatibility Restoration: Re-implemented or restored key functionalities for the authentication manager, including registration, updates, loading, listing, auto-refresh, executor management, retry configurations, and result marking, along with quota cooldown mechanisms.
  • Configuration Import Path Updates: Standardized internal configuration import paths across various authentication packages.
Changelog
  • pkg/llmproxy/api/handlers/management/auth_gemini.go
    • Imported base.BaseTokenStorage and updated GeminiTokenStorage to embed it, centralizing email management.
  • pkg/llmproxy/api/handlers/management/auth_github.go
    • Imported base.BaseTokenStorage and updated CopilotTokenStorage to embed it, centralizing access token and type management.
  • pkg/llmproxy/api/handlers/management/auth_kilo.go
    • Imported base.BaseTokenStorage and updated KiloTokenStorage to embed it, centralizing email and type management.
  • pkg/llmproxy/auth/claude/anthropic_auth.go
    • Updated the import path for the internal configuration package.
  • pkg/llmproxy/auth/claude/utls_transport.go
    • Updated the import path for the internal configuration package.
  • pkg/llmproxy/auth/codex/openai_auth.go
    • Updated the import path for the internal configuration package.
  • pkg/llmproxy/auth/codex/openai_auth_test.go
    • Imported base.BaseTokenStorage and updated CodexTokenStorage initialization to use the base struct.
  • pkg/llmproxy/auth/codex/token_test.go
    • Imported base.BaseTokenStorage and updated CodexTokenStorage initialization to use the base struct.
  • pkg/llmproxy/auth/copilot/copilot_auth.go
    • Updated the import path for the internal configuration package.
  • pkg/llmproxy/auth/copilot/copilot_extra_test.go
    • Imported base.BaseTokenStorage and updated CopilotTokenStorage initialization to use the base struct.
  • pkg/llmproxy/auth/copilot/token_test.go
    • Imported base.BaseTokenStorage and updated CopilotTokenStorage initialization to use the base struct.
  • pkg/llmproxy/auth/gemini/gemini_auth.go
    • Updated the import path for the internal configuration package.
  • pkg/llmproxy/auth/gemini/gemini_auth_test.go
    • Imported base.BaseTokenStorage, updated GeminiTokenStorage initialization to use the base struct, and refined the error message for path traversal.
  • pkg/llmproxy/auth/iflow/iflow_auth.go
    • Updated the import path for the internal configuration package.
  • pkg/llmproxy/auth/kimi/kimi.go
    • Updated the import path for the internal configuration package.
  • pkg/llmproxy/auth/kimi/token_path_test.go
    • Imported base.BaseTokenStorage, updated KimiTokenStorage initialization to use the base struct, and refined the error message for path traversal.
  • pkg/llmproxy/auth/qwen/qwen_auth.go
    • Imported base.BaseTokenStorage and updated CreateTokenStorage to embed BaseTokenStorage.
  • pkg/llmproxy/auth/qwen/qwen_auth_test.go
    • Imported base.BaseTokenStorage and updated QwenTokenStorage initialization to use the base struct.
  • pkg/llmproxy/auth/qwen/qwen_token.go
    • Refactored QwenTokenStorage to embed base.BaseTokenStorage, added LastRefresh and Expire fields, and updated NewQwenTokenStorage and SaveTokenToFile to align with the new structure.
  • pkg/llmproxy/auth/qwen/qwen_token_test.go
    • Imported base.BaseTokenStorage and updated QwenTokenStorage initialization to use the base struct.
  • pkg/llmproxy/executor/kiro_auth.go
    • Removed the entire file, eliminating Kiro-specific authentication logic from the executor package.
  • pkg/llmproxy/executor/kiro_executor.go
    • Removed unused imports and dependencies related to Kiro authentication and usage tracking.
  • pkg/llmproxy/executor/kiro_streaming.go
    • Corrected import aliases for cliproxyauth and cliproxyexecutor to ensure proper package referencing.
  • pkg/llmproxy/executor/kiro_transform.go
    • Corrected the import alias for cliproxyauth to ensure proper package referencing.
  • pkg/llmproxy/thinking/apply.go
    • Introduced parseIFlowReasoningEffort to handle reasoning.effort fields and integrated it into extractIFlowConfig for improved iFlow thinking configuration parsing.
  • pkg/llmproxy/thinking/apply_iflow_test.go
    • Added a new test file to verify the correct extraction of iFlow thinking configurations based on reasoning.effort fields.
  • pkg/llmproxy/thinking/provider/iflow/apply.go
    • Added logic to remove reasoning_effort, reasoning, and variant fields from the request body after applying thinking configurations for iFlow.
  • pkg/llmproxy/thinking/provider/iflow/apply_test.go
    • Added a new test file to confirm that ApplyMiniMax correctly strips reasoning and variant fields from iFlow requests.
  • pkg/llmproxy/thinking/strip.go
    • Expanded the list of fields to strip for iflow provider to include reasoning and variant, and added logic to remove chat_template_kwargs if it becomes empty.
  • pkg/llmproxy/thinking/strip_test.go
    • Added a new test file to validate the stripping of thinking-related fields for the iflow provider, including empty chat_template_kwargs.
  • pkg/llmproxy/usage/metrics.go
    • Updated the import path for the util package.
  • sdk/api/handlers/openai/endpoint_compat.go
    • Enhanced resolveEndpointOverride to consider all model providers and introduced shouldForceNonStreamingChatBridge for specific models.
  • sdk/api/handlers/openai/endpoint_compat_test.go
    • Added a new test file to verify OpenAI endpoint override logic and the shouldForceNonStreamingChatBridge function.
  • sdk/api/handlers/openai/openai_handlers.go
    • Modified the call to resolveEndpointOverride to include the provider constant for more precise endpoint resolution.
  • sdk/api/handlers/openai/openai_responses_handlers.go
    • Modified the call to resolveEndpointOverride to include the provider constant for more precise endpoint resolution.
  • sdk/auth/kilo.go
    • Imported base.BaseTokenStorage and updated KiloTokenStorage initialization to use the base struct.
  • sdk/cliproxy/auth/cooldown_compat.go
    • Added a new file with functions SetQuotaCooldownDisabled and isQuotaCooldownDisabled to manage quota cooldown behavior.
  • sdk/cliproxy/auth/manager_compat.go
    • Added a new file that re-implements core Manager functionalities, including auth registration, updates, loading, listing, auto-refresh, executor management, retry configuration, and result marking, ensuring compatibility.
  • sdk/cliproxy/auth/oauth_model_alias.go
    • Added apiKeyModelAliasTable type and rebuildAPIKeyModelAliasFromRuntimeConfig method to manage API key model aliases.
Activity
  • No specific activity has been recorded for this pull request yet.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

  • Code Review: /gemini review performs a code review of the current pull request in its current state.
  • Pull Request Summary: /gemini summary provides a summary of the current pull request in its current state.
  • Comment: @gemini-code-assist responds in comments when explicitly tagged, both in pull request comments and review comments.
  • Help: /gemini help displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@coderabbitai coderabbitai Bot added HELIOS-CODEX Bundle identifier for HELIOS-CODEX release train HELIOS-CODEX-L0 HELIOS-CODEX foundation layer labels Mar 15, 2026

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces a series of stability fixes and refactorings across the authentication and execution paths. Key changes include refactoring various token storage implementations to use a common base struct, which improves maintainability. Additionally, there are fixes to stabilize the 'thinking' feature for the iFlow provider and to improve OpenAI endpoint compatibility. A significant part of this PR appears to be the restoration of the auth manager and executor compatibility, which should enhance overall system stability. My review includes one suggestion for removing redundant code in the Qwen token handling.

Comment thread: pkg/llmproxy/auth/qwen/qwen_token.go (Outdated)
Comment on lines 55 to 57
if _, err := cleanTokenFilePath(authFilePath, "qwen token"); err != nil {
return err
}

Severity: medium

This path validation check is redundant. The base.BaseTokenStorage.Save method, which is called on line 60, already performs the same path validation. Removing this block and the local cleanTokenFilePath function will help avoid code duplication and rely on the centralized validation in the base class.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 16

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
pkg/llmproxy/auth/claude/utls_transport.go (1)

67-82: ⚠️ Potential issue | 🔴 Critical

Race condition: multiple goroutines can create duplicate connections.

When Broadcast() wakes multiple waiting goroutines, they sequentially acquire the lock. The first waiter finds no connection, falls through, and creates a new pending entry. However, subsequent waiters also fall through (the original pending was deleted before Broadcast) without checking if another waiter has already started creating a connection—they create their own pending entry and connection.

This can lead to resource leaks (duplicate connections) and wasted resources.

🐛 Proposed fix: Use a loop to re-check after waking
 func (t *utlsRoundTripper) getOrCreateConnection(host, addr string) (*http2.ClientConn, error) {
 	t.mu.Lock()
 
-	// Check if connection exists and is usable
-	if h2Conn, ok := t.connections[host]; ok && h2Conn.CanTakeNewRequest() {
-		t.mu.Unlock()
-		return h2Conn, nil
-	}
-
-	// Check if another goroutine is already creating a connection
-	if cond, ok := t.pending[host]; ok {
-		// Wait for the other goroutine to finish
-		cond.Wait()
-		// Check if connection is now available
+	for {
+		// Check if connection exists and is usable
 		if h2Conn, ok := t.connections[host]; ok && h2Conn.CanTakeNewRequest() {
 			t.mu.Unlock()
 			return h2Conn, nil
 		}
-		// Connection still not available, we'll create one
+
+		// Check if another goroutine is already creating a connection
+		if cond, ok := t.pending[host]; ok {
+			// Wait for the other goroutine to finish, then re-check
+			cond.Wait()
+			continue
+		}
+
+		// No one is creating a connection, we'll do it
+		break
 	}
 
 	// Mark this host as pending
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/llmproxy/auth/claude/utls_transport.go` around lines 67 - 82, The current
wait logic can let multiple waiters proceed to create duplicate connections;
change the single Wait() to a loop that re-checks both t.connections[host] and
t.pending[host] after each wake: acquire the lock, and while there is an
existing pending entry for host, call cond.Wait(); after waking, if a usable
connection exists return it; if no pending entry exists anymore then this
goroutine should create the new sync.Cond, assign t.pending[host]=cond and
proceed as the sole creator; ensure all checks reference t.pending,
t.connections, cond.Wait(), and CanTakeNewRequest() so only one goroutine
becomes the creator.
sdk/api/handlers/openai/openai_responses_handlers.go (1)

91-98: ⚠️ Potential issue | 🟠 Major

The Minimax non-stream fallback never takes effect here.

This bridge still honors stream unconditionally. shouldForceNonStreamingChatBridge was added for this compatibility path, but this branch never applies it, so Minimax requests will keep going through the streaming chat bridge instead of the intended non-stream fallback.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/api/handlers/openai/openai_responses_handlers.go` around lines 91 - 98,
The branch that converts Responses->Chat currently respects the incoming
"stream" flag unconditionally (in the block using resolveEndpointOverride and
responsesconverter.ConvertOpenAIResponsesRequestToOpenAIChatCompletions), so the
Minimax non-stream fallback never triggers; update this block to consult
shouldForceNonStreamingChatBridge(modelName, rawJSON) after creating chatJSON
and override the stream value to false when that function returns true before
calling h.handleStreamingResponseViaChat or h.handleNonStreamingResponseViaChat
so Minimax requests are routed to the non-streaming handler; reference
resolveEndpointOverride, openAIResponsesEndpoint, OpenaiResponse,
openAIChatEndpoint,
responsesconverter.ConvertOpenAIResponsesRequestToOpenAIChatCompletions,
shouldForceNonStreamingChatBridge, h.handleStreamingResponseViaChat and
h.handleNonStreamingResponseViaChat.
sdk/api/handlers/openai/openai_handlers.go (1)

117-130: 🛠️ Refactor suggestion | 🟠 Major

Extract this compat branch out of ChatCompletions.

This routing path pushes ChatCompletions further past the 40-line limit and makes the endpoint/format dispatch harder to follow. Pull the override-and-conversion flow into a helper before adding more cases.

As per coding guidelines, "Maximum function length: 40 lines".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/api/handlers/openai/openai_handlers.go` around lines 117 - 130, Extract
the compat branch inside ChatCompletions into a new helper method (e.g.,
handleResponsesCompat or resolveAndHandleResponsesOverride) that accepts the
same inputs used there (modelName, rawJSON, stream flag, and the context/caller
receiver h and c) and returns a boolean indicating “handled” (true means caller
should return). Move the resolveEndpointOverride check, originalChat assignment,
shouldTreatAsResponsesFormat check, codexconverter.ConvertOpenAIRequestToCodex
call, re-evaluation of stream via gjson.GetBytes(rawJSON, "stream").Bool(), and
the conditional calls to h.handleStreamingResponseViaResponses /
h.handleNonStreamingResponseViaResponses into that helper. Replace the original
inlined block with a single if helper(...) { return } so ChatCompletions stays
under 40 lines and routing is clearer.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/llmproxy/auth/claude/utls_transport.go`:
- Line 13: Current code imports logrus as log (import alias: log
"github.com/sirupsen/logrus"); follow-up refactor should replace that import
with zerolog (e.g., github.com/rs/zerolog) and update all usages of the log
alias (calls like log.Info, log.Errorf, log.Debug, etc.) to zerolog equivalents
(logger.Info().Msg, logger.Error().Err(err).Msg, etc.), initializing a zerolog
logger instance where needed and ensuring structured fields are migrated to
With/Dict patterns; search for the import line and all references to the log
identifier in utls_transport.go (and related files) to perform the replacement
consistently.

In `@pkg/llmproxy/thinking/provider/iflow/apply_test.go`:
- Around line 54-93: Update the
TestApplyMiniMaxStripsReasoningVariantsAndLegacyFields test to include a
regression case where the JSON body contains a top-level "reasoning.effort" key
(e.g. "reasoning.effort":"medium") alongside the existing fields; call
a.Apply(body, cfg, model) as before and add an assertion that
res.Get("reasoning.effort").Exists() is false (or equivalent check) to ensure
Apply removes the literal reasoning.effort key; keep the rest of the assertions
unchanged so unrelated fields (e.g. "foo") are still preserved.

In `@pkg/llmproxy/thinking/provider/iflow/apply.go`:
- Around line 146-148: The cleanup in applyMiniMax misses the literal dotted key
"reasoning.effort" so add an sjson.DeleteBytes call to remove the escaped path
"reasoning\\.effort" alongside the existing deletes; locate the applyMiniMax
function in pkg/llmproxy/thinking/provider/iflow/apply.go and insert a deletion
using sjson.DeleteBytes(result, "reasoning\\.effort") (follow existing error
handling pattern) so that keys accepted by extractIFlowConfig (which reads
"reasoning\\.effort") are not leaked upstream.

In `@pkg/llmproxy/thinking/strip.go`:
- Around line 43-50: The iflow stripping list in pkg/llmproxy/thinking/strip.go
currently removes nested "reasoning" but not a literal top-level key
"reasoning.effort"; update the paths slice (the variable assigned to paths in
strip.go) to include the escaped literal key "reasoning\.effort" so top-level
dot-keys are stripped as well, and consider moving this paths list into a shared
constant used by the iflow cleanup code in provider/iflow/apply.go to avoid
drift.

In `@sdk/api/handlers/openai/endpoint_compat_test.go`:
- Around line 62-81: Register a second provider for the same modelID that only
supports "/responses" (e.g., call registry.GetGlobalRegistry().RegisterClient
with a different client name and provider id like "bridge" and ModelInfo
SupportedEndpoints []string{"/responses"}), add a t.Cleanup to UnregisterClient
that second provider, and then update the assertions around
resolveEndpointOverride(modelID, "/responses", OpenaiResponse) to assert that ok
is true and overrideEndpoint == "/responses" (i.e., the requested /responses
remains unmodified) while keeping the original iflow registration in place so
the mixed-provider case is exercised.
- Around line 45-50: The cleanup is unregistering a different client ID than the
one registered, leaking the registry entry; update the UnregisterClient call so
it uses the exact same client ID string used in RegisterClient
("endpoint-compat-iflow-direction")—i.e., locate the register call via
registry.GetGlobalRegistry().RegisterClient(...) and make the corresponding
registry.GetGlobalRegistry().UnregisterClient(...) use that identical ID.

In `@sdk/api/handlers/openai/endpoint_compat.go`:
- Around line 19-26: The loop treating
resolveEndpointOverrideForInfo(reg.GetModelInfo(...), requestedEndpoint) the
same for both "no override" and "provider already supports endpoint" causes
native-support cases to be overwritten by later providers; update the logic so
that when iterating providers from reg.GetModelProviders(modelName) you stop and
return immediately if the provider already supports the requested endpoint
(e.g., check modelInfo.SupportsEndpoint(requestedEndpoint) on
reg.GetModelInfo(modelName, provider) or change resolveEndpointOverrideForInfo
to return an explicit tri-state like (override, ok, supported)); only continue
scanning when there is no native support and an override is not present, and
only accept an override when resolveEndpointOverrideForInfo indicates an actual
override (override + ok).

In `@sdk/cliproxy/auth/cooldown_compat.go`:
- Around line 3-5: Add a Go doc comment for the exported function
SetQuotaCooldownDisabled describing its purpose and usage; place a brief
sentence or two immediately above the SetQuotaCooldownDisabled function stating
that it enables/disables quota cooldown behavior and when callers should use it,
and reference the underlying quotaCooldownDisabled store variable in the comment
for clarity.

In `@sdk/cliproxy/auth/manager_compat.go`:
- Around line 16-24: Create shared error message constants (e.g.,
ErrMsgAuthRequired, ErrMsgAuthIDRequired, ErrMsgProviderRequired) in the auth
package (or a nearby shared file) and replace the inline string literals in
manager_compat.go (locations around the auth validation branches and the other
occurrences you noted at lines ~54-60, ~439, ~476) with those constants; update
the Error{Message: ...} initializers to use the new constants (or fmt.Sprintf
with template constants if interpolation is needed) so all
register/update/select paths reference the same template values.
- Around line 39-46: The code updates the in-memory map m.auths (via
m.mu.Lock()/m.auths[next.ID] = next.Clone()) before calling m.store.Save(ctx,
next.Clone()), which can leave runtime state ahead of durable state on save
failure or races; change the flow in the methods touching m.auths and
m.store.Save (the two mirrored blocks around m.auths/next.Clone and m.store.Save
guarded by shouldSkipPersist) so you attempt the persistent Save first (call
m.store.Save with the clone), and only if Save succeeds acquire m.mu, assign
m.auths[next.ID] = next.Clone() and then unlock; alternatively hold m.mu for the
duration of Save to serialize races—ensure you use next.Clone for both Save and
the in-memory assignment and return the original error if Save fails.
- Around line 148-163: StartAutoRefresh ignores the interval and does nothing
besides waiting for ctx.Done(); change it to use the provided interval parameter
(remove the blank identifier) and spawn a goroutine that ticks at that interval
to run the Manager's refresh logic until canceled. Specifically: accept the
interval argument in StartAutoRefresh, call StopAutoRefresh() as you already do,
create refreshCtx and cancel, store cancel in m.refreshCancel, then start a
goroutine that creates a time.Ticker from the interval (handle zero/negative by
defaulting or returning) and loops with select { case <-ticker.C: invoke the
manager's refresh routine (e.g., m.Refresh() or the method that renews auths);
case <-refreshCtx.Done(): ticker.Stop(); return }. Retain the existing names
Manager, StartAutoRefresh, StopAutoRefresh, m.refreshCancel, refreshCtx, cancel
to locate the change.
- Around line 452-487: The selection mixes auths from all providers into
candidates but picks an executor only for the first provider with an executor
and always calls selector.Pick with providers[0], causing provider mismatch;
instead, for each provider iterate building provider-scoped candidates from
m.auths (using strings.EqualFold(auth.Provider, provider) and pinnedAuthID
filtering), get the corresponding exec from m.executors, and if exec != nil and
that provider has candidates call selector.Pick(ctx, provider, model, opts,
providerCandidates); if selector.Pick returns an auth, call
updateAggregatedAvailability and selectedAuthCallback(auth.ID) and immediately
return that auth and the matching exec; only if no provider yields a pick return
the auth_not_found error—this ensures executor and auth come from the same
provider (referencing candidates, executor, m.executors, providers,
selector.Pick, m.auths, pinnedAuthID, updateAggregatedAvailability,
selectedAuthCallback).
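The per-provider scoping can be sketched like this; `auths`, `executors`, and `pick` are hypothetical stand-ins for the manager state and selector:

```go
package main

import (
	"fmt"
	"strings"
)

type Auth struct{ ID, Provider string }

var (
	auths = []*Auth{
		{ID: "g1", Provider: "Gemini"},
		{ID: "c1", Provider: "claude"},
	}
	executors = map[string]string{"claude": "claude-exec"} // gemini has no executor
)

// pick returns the first candidate (selector stand-in).
func pick(candidates []*Auth) *Auth {
	if len(candidates) == 0 {
		return nil
	}
	return candidates[0]
}

// selectAuthAndExecutor scopes candidates per provider, so the returned
// auth and executor always come from the same provider.
func selectAuthAndExecutor(providers []string) (*Auth, string, error) {
	for _, provider := range providers {
		exec, ok := executors[provider]
		if !ok {
			continue // no executor registered for this provider
		}
		var candidates []*Auth
		for _, a := range auths {
			if strings.EqualFold(a.Provider, provider) {
				candidates = append(candidates, a)
			}
		}
		if auth := pick(candidates); auth != nil {
			return auth, exec, nil
		}
	}
	return nil, "", fmt.Errorf("auth_not_found")
}

func main() {
	auth, exec, err := selectAuthAndExecutor([]string{"gemini", "claude"})
	fmt.Println(auth.ID, exec, err)
}
```

Here gemini is skipped for lack of an executor and the claude auth is paired with the claude executor, which is exactly the mismatch the original code could violate.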
- Around line 428-434: The ExecuteStream path currently returns
executor.ExecuteStream errors without updating quota/backoff state; modify
Manager.ExecuteStream so after selecting auth/executor (selectAuthAndExecutor)
and applying applyAPIKeyModelAlias, you call executor.ExecuteStream and if it
returns an error immediately call m.recordExecutionResult(...) with the same
ctx, auth, req.Model (or the aliased model) and an appropriate failure result to
update quota/backoff state before returning the error; keep the existing success
return path unchanged so only startup failures are recorded.
- Around line 543-574: The switch on
strings.ToLower(strings.TrimSpace(auth.Provider)) duplicates the same
API-key/model lookup for cfg.GeminiKey, cfg.ClaudeKey and cfg.CodexKey; extract
a provider registry/helper and replace the three branches with a single generic
lookup. Implement a small provider registry (map[string]func(*Config)[]KeyType
or an interface) that maps "gemini"/"claude"/"codex" to functions returning the
corresponding cfg.<GeminiKey|ClaudeKey|CodexKey> slice, then in the current
function call that helper once: iterate the returned keys, compare
strings.TrimSpace(key.APIKey) and BaseURL, build models []modelAliasEntry and
call resolveModelAliasFromConfigModels(requestedModel, models). Update names
referenced here (auth.Provider, cfg.GeminiKey, cfg.ClaudeKey, cfg.CodexKey,
resolveModelAliasFromConfigModels, modelAliasEntry) and ensure the refactor
keeps the logic identical and reduces the function length below 40 lines.
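The registry shape can be sketched as a table of accessors; `Config` and `KeyType` here are hypothetical stand-ins for the real config structs:

```go
package main

import (
	"fmt"
	"strings"
)

type KeyType struct{ APIKey, BaseURL string }

type Config struct {
	GeminiKey []KeyType
	ClaudeKey []KeyType
	CodexKey  []KeyType
}

// providerKeys replaces the three duplicated switch branches with one
// table lookup mapping provider names to key-slice accessors.
var providerKeys = map[string]func(*Config) []KeyType{
	"gemini": func(c *Config) []KeyType { return c.GeminiKey },
	"claude": func(c *Config) []KeyType { return c.ClaudeKey },
	"codex":  func(c *Config) []KeyType { return c.CodexKey },
}

func keysFor(cfg *Config, provider string) []KeyType {
	get, ok := providerKeys[strings.ToLower(strings.TrimSpace(provider))]
	if !ok {
		return nil
	}
	return get(cfg)
}

func main() {
	cfg := &Config{ClaudeKey: []KeyType{{APIKey: "sk-1", BaseURL: "https://api.example"}}}
	fmt.Println(len(keysFor(cfg, " Claude ")), len(keysFor(cfg, "unknown")))
}
```

The caller then iterates the returned keys and performs the alias resolution once, instead of per provider branch.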
- Around line 186-193: The manager currently calls CloseExecutionSession while
holding m.mu which can deadlock; modify Set/replace logic so you capture the
replaced executor (m.executors[key]) while holding m.mu, then release the lock
and call CloseExecutionSession(CloseAllExecutionSessionsID) on it outside the
lock; specifically, in the function that assigns m.executors[key] (the block
that checks for replaced, ok := m.executors[key] and type-asserts to
ExecutionSessionCloser), store the replaced value in a local variable while
locked, perform m.executors[key] = executor, unlock (defer removal of defer if
needed), and only after the mutex is released, if the local variable implements
ExecutionSessionCloser call
closer.CloseExecutionSession(CloseAllExecutionSessionsID).
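The capture-then-close-outside-the-lock pattern looks roughly like this (executor types are stand-ins):

```go
package main

import (
	"fmt"
	"sync"
)

type ExecutionSessionCloser interface{ CloseExecutionSession(id string) }

type exec struct{ closed []string }

func (e *exec) CloseExecutionSession(id string) { e.closed = append(e.closed, id) }

const CloseAllExecutionSessionsID = "*"

type Manager struct {
	mu        sync.Mutex
	executors map[string]any
}

// Set captures the replaced executor while locked, but calls
// CloseExecutionSession only after releasing the mutex, avoiding the
// deadlock when the closer re-enters the manager.
func (m *Manager) Set(key string, e any) {
	m.mu.Lock()
	replaced := m.executors[key]
	m.executors[key] = e
	m.mu.Unlock()
	if closer, ok := replaced.(ExecutionSessionCloser); ok {
		closer.CloseExecutionSession(CloseAllExecutionSessionsID)
	}
}

func main() {
	old := &exec{}
	m := &Manager{executors: map[string]any{"k": old}}
	m.Set("k", &exec{})
	fmt.Println("old closed:", len(old.closed) == 1 && old.closed[0] == "*")
}
```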

In `@sdk/cliproxy/auth/oauth_model_alias.go`:
- Around line 60-65: The method rebuildAPIKeyModelAliasFromRuntimeConfig
currently only clears m.apiKeyModelAlias but its name implies it reads runtime
config and repopulates entries; either rename it to clearAPIKeyModelAliasCache
(leave body as m.apiKeyModelAlias.Store(apiKeyModelAliasTable(nil))) or
implement the intended rebuild: fetch the runtime config from Manager (e.g.
m.runtimeConfig or a getter like m.GetRuntimeConfig()), build the alias table
via apiKeyModelAliasTable(cfg) and then call m.apiKeyModelAlias.Store(...) to
replace the cache; update callers to match the new name or behavior and keep
references to Manager, rebuildAPIKeyModelAliasFromRuntimeConfig,
apiKeyModelAlias, and apiKeyModelAliasTable to locate the change.
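If the rebuild behavior is the intent, the shape is roughly the following; `runtimeConfig` is a hypothetical stand-in for however Manager exposes its runtime config:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

type apiKeyModelAliasTable map[string]string

type runtimeConfig struct{ aliases map[string]string }

type Manager struct {
	apiKeyModelAlias atomic.Value // holds apiKeyModelAliasTable
	runtimeConfig    *runtimeConfig
}

// rebuildAPIKeyModelAliasFromRuntimeConfig lives up to its name: it reads
// the runtime config and stores a freshly built table, instead of only
// clearing the cache.
func (m *Manager) rebuildAPIKeyModelAliasFromRuntimeConfig() {
	table := apiKeyModelAliasTable{}
	if m.runtimeConfig != nil {
		for k, v := range m.runtimeConfig.aliases {
			table[k] = v
		}
	}
	m.apiKeyModelAlias.Store(table)
}

func main() {
	m := &Manager{runtimeConfig: &runtimeConfig{aliases: map[string]string{"fast": "gpt-4o-mini"}}}
	m.rebuildAPIKeyModelAliasFromRuntimeConfig()
	table := m.apiKeyModelAlias.Load().(apiKeyModelAliasTable)
	fmt.Println(table["fast"])
}
```

The rename-only option (`clearAPIKeyModelAliasCache` storing an empty table) is the smaller change if callers always repopulate lazily.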

---

Outside diff comments:
In `@pkg/llmproxy/auth/claude/utls_transport.go`:
- Around line 67-82: The current wait logic can let multiple waiters proceed to
create duplicate connections; change the single Wait() to a loop that re-checks
both t.connections[host] and t.pending[host] after each wake: acquire the lock,
and while there is an existing pending entry for host, call cond.Wait(); after
waking, if a usable connection exists return it; if no pending entry exists
anymore then this goroutine should create the new sync.Cond, assign
t.pending[host]=cond and proceed as the sole creator; ensure all checks
reference t.pending, t.connections, cond.Wait(), and CanTakeNewRequest() so only
one goroutine becomes the creator.

In `@sdk/api/handlers/openai/openai_handlers.go`:
- Around line 117-130: Extract the compat branch inside ChatCompletions into a
new helper method (e.g., handleResponsesCompat or
resolveAndHandleResponsesOverride) that accepts the same inputs used there
(modelName, rawJSON, stream flag, and the context/caller receiver h and c) and
returns a boolean indicating “handled” (true means caller should return). Move
the resolveEndpointOverride check, originalChat assignment,
shouldTreatAsResponsesFormat check, codexconverter.ConvertOpenAIRequestToCodex
call, re-evaluation of stream via gjson.GetBytes(rawJSON, "stream").Bool(), and
the conditional calls to h.handleStreamingResponseViaResponses /
h.handleNonStreamingResponseViaResponses into that helper. Replace the original
inlined block with a single if helper(...) { return } so ChatCompletions stays
under 40 lines and routing is clearer.

In `@sdk/api/handlers/openai/openai_responses_handlers.go`:
- Around line 91-98: The branch that converts Responses->Chat currently respects
the incoming "stream" flag unconditionally (in the block using
resolveEndpointOverride and
responsesconverter.ConvertOpenAIResponsesRequestToOpenAIChatCompletions), so the
Minimax non-stream fallback never triggers; update this block to consult
shouldForceNonStreamingChatBridge(modelName, rawJSON) after creating chatJSON
and override the stream value to false when that function returns true before
calling h.handleStreamingResponseViaChat or h.handleNonStreamingResponseViaChat
so Minimax requests are routed to the non-streaming handler; reference
resolveEndpointOverride, openAIResponsesEndpoint, OpenaiResponse,
openAIChatEndpoint,
responsesconverter.ConvertOpenAIResponsesRequestToOpenAIChatCompletions,
shouldForceNonStreamingChatBridge, h.handleStreamingResponseViaChat and
h.handleNonStreamingResponseViaChat.
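The override check can be sketched as follows; `shouldForceNonStreamingChatBridge` here is reduced to a model-prefix test and the handlers to returned names, purely as stand-ins:

```go
package main

import (
	"fmt"
	"strings"
)

// Stand-in for the real check: any MiniMax model forces the
// non-streaming chat bridge.
func shouldForceNonStreamingChatBridge(modelName string) bool {
	return strings.HasPrefix(strings.ToLower(modelName), "minimax")
}

// route consults the override after converting Responses->Chat, instead
// of trusting the incoming "stream" flag unconditionally.
func route(modelName string, stream bool) string {
	if shouldForceNonStreamingChatBridge(modelName) {
		stream = false // MiniMax: fall back to the non-streaming handler
	}
	if stream {
		return "handleStreamingResponseViaChat"
	}
	return "handleNonStreamingResponseViaChat"
}

func main() {
	fmt.Println(route("MiniMax-M2", true))
	fmt.Println(route("gpt-4o", true))
}
```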

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: ceff7acf-7eb6-4d80-a16e-d01eb1d31360

📥 Commits

Reviewing files that changed from the base of the PR and between a82cb72 and c7079b7.

📒 Files selected for processing (39)
  • pkg/llmproxy/api/handlers/management/auth_gemini.go
  • pkg/llmproxy/api/handlers/management/auth_github.go
  • pkg/llmproxy/api/handlers/management/auth_kilo.go
  • pkg/llmproxy/auth/claude/anthropic_auth.go
  • pkg/llmproxy/auth/claude/utls_transport.go
  • pkg/llmproxy/auth/codex/openai_auth.go
  • pkg/llmproxy/auth/codex/openai_auth_test.go
  • pkg/llmproxy/auth/codex/token_test.go
  • pkg/llmproxy/auth/copilot/copilot_auth.go
  • pkg/llmproxy/auth/copilot/copilot_extra_test.go
  • pkg/llmproxy/auth/copilot/token_test.go
  • pkg/llmproxy/auth/gemini/gemini_auth.go
  • pkg/llmproxy/auth/gemini/gemini_auth_test.go
  • pkg/llmproxy/auth/iflow/iflow_auth.go
  • pkg/llmproxy/auth/kimi/kimi.go
  • pkg/llmproxy/auth/kimi/token_path_test.go
  • pkg/llmproxy/auth/qwen/qwen_auth.go
  • pkg/llmproxy/auth/qwen/qwen_auth_test.go
  • pkg/llmproxy/auth/qwen/qwen_token.go
  • pkg/llmproxy/auth/qwen/qwen_token_test.go
  • pkg/llmproxy/executor/kiro_auth.go
  • pkg/llmproxy/executor/kiro_executor.go
  • pkg/llmproxy/executor/kiro_streaming.go
  • pkg/llmproxy/executor/kiro_transform.go
  • pkg/llmproxy/thinking/apply.go
  • pkg/llmproxy/thinking/apply_iflow_test.go
  • pkg/llmproxy/thinking/provider/iflow/apply.go
  • pkg/llmproxy/thinking/provider/iflow/apply_test.go
  • pkg/llmproxy/thinking/strip.go
  • pkg/llmproxy/thinking/strip_test.go
  • pkg/llmproxy/usage/metrics.go
  • sdk/api/handlers/openai/endpoint_compat.go
  • sdk/api/handlers/openai/endpoint_compat_test.go
  • sdk/api/handlers/openai/openai_handlers.go
  • sdk/api/handlers/openai/openai_responses_handlers.go
  • sdk/auth/kilo.go
  • sdk/cliproxy/auth/cooldown_compat.go
  • sdk/cliproxy/auth/manager_compat.go
  • sdk/cliproxy/auth/oauth_model_alias.go
💤 Files with no reviewable changes (2)
  • pkg/llmproxy/executor/kiro_executor.go
  • pkg/llmproxy/executor/kiro_auth.go
📜 Review details
🧰 Additional context used
📓 Path-based instructions (1)
**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

**/*.go: NEVER create a v2 file - refactor the original instead
NEVER create a new class if an existing one can be made generic
NEVER create custom implementations when an OSS library exists - search pkg.go.dev for existing libraries before writing code
Build generic building blocks (provider interface + registry) before application logic
Use chi for HTTP routing (NOT custom routers)
Use zerolog for logging (NOT fmt.Print)
Use viper for configuration (NOT manual env parsing)
Use go-playground/validator for validation (NOT manual if/else validation)
Use golang.org/x/time/rate for rate limiting (NOT custom limiters)
Use template strings for messages instead of hardcoded messages and config-driven logic instead of code-driven
Zero new lint suppressions without inline justification
All new code must pass: go fmt, go vet, golint
Maximum function length: 40 lines
No placeholder TODOs in committed code

Files:

  • pkg/llmproxy/auth/gemini/gemini_auth_test.go
  • pkg/llmproxy/auth/codex/openai_auth.go
  • pkg/llmproxy/auth/iflow/iflow_auth.go
  • pkg/llmproxy/auth/claude/anthropic_auth.go
  • pkg/llmproxy/auth/claude/utls_transport.go
  • pkg/llmproxy/auth/copilot/copilot_auth.go
  • pkg/llmproxy/thinking/provider/iflow/apply.go
  • sdk/cliproxy/auth/cooldown_compat.go
  • pkg/llmproxy/auth/gemini/gemini_auth.go
  • pkg/llmproxy/auth/codex/openai_auth_test.go
  • sdk/api/handlers/openai/openai_responses_handlers.go
  • pkg/llmproxy/thinking/strip_test.go
  • pkg/llmproxy/thinking/provider/iflow/apply_test.go
  • sdk/cliproxy/auth/oauth_model_alias.go
  • sdk/api/handlers/openai/endpoint_compat_test.go
  • pkg/llmproxy/thinking/apply_iflow_test.go
  • pkg/llmproxy/auth/copilot/copilot_extra_test.go
  • pkg/llmproxy/auth/kimi/kimi.go
  • pkg/llmproxy/executor/kiro_streaming.go
  • sdk/auth/kilo.go
  • sdk/api/handlers/openai/openai_handlers.go
  • pkg/llmproxy/api/handlers/management/auth_kilo.go
  • pkg/llmproxy/usage/metrics.go
  • pkg/llmproxy/api/handlers/management/auth_gemini.go
  • pkg/llmproxy/executor/kiro_transform.go
  • sdk/api/handlers/openai/endpoint_compat.go
  • pkg/llmproxy/auth/qwen/qwen_auth_test.go
  • pkg/llmproxy/auth/codex/token_test.go
  • pkg/llmproxy/api/handlers/management/auth_github.go
  • pkg/llmproxy/thinking/strip.go
  • pkg/llmproxy/auth/qwen/qwen_token_test.go
  • pkg/llmproxy/auth/copilot/token_test.go
  • pkg/llmproxy/auth/qwen/qwen_token.go
  • pkg/llmproxy/auth/qwen/qwen_auth.go
  • pkg/llmproxy/auth/kimi/token_path_test.go
  • pkg/llmproxy/thinking/apply.go
  • sdk/cliproxy/auth/manager_compat.go
🧠 Learnings (1)
📚 Learning: 2026-02-25T10:11:41.448Z
Learnt from: CR
Repo: KooshaPari/cliproxyapi-plusplus PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-02-25T10:11:41.448Z
Learning: Applies to **/*.go : Use template strings for messages instead of hardcoded messages and config-driven logic instead of code-driven

Applied to files:

  • pkg/llmproxy/thinking/strip.go
🔇 Additional comments (38)
pkg/llmproxy/usage/metrics.go (1)

7-7: Import path migration looks correct.

Line 7 now matches the module declared in go.mod and the util package path, which avoids mixed-prefix import/build issues.

pkg/llmproxy/executor/kiro_transform.go (1)

13-13: Import alias rename is clean and non-breaking.

Line 13 updates the alias to cliproxyauth and usage remains consistent in this file.

pkg/llmproxy/executor/kiro_streaming.go (1)

22-23: Alias updates are consistent and safe.

Lines 22-23 align import aliases with current naming (cliproxyauth, cliproxyexecutor) and all in-file references match.

pkg/llmproxy/auth/claude/anthropic_auth.go (1)

16-16: Import path migration looks correct.

This update is consistent with relocating config usage to pkg/llmproxy/config and introduces no functional risk in this segment.

sdk/cliproxy/auth/cooldown_compat.go (1)

7-9: LGTM!

Correct use of atomic.Bool.Load() for thread-safe read access. The unexported getter properly encapsulates access to the package-level variable and aligns with its usage in manager_compat.go.

pkg/llmproxy/auth/iflow/iflow_auth.go (1)

16-16: Import migration verified as complete and consistent across all auth modules.

The import path update from internal/config to pkg/llmproxy/config is correct and has been successfully applied across all auth providers (copilot, codex, synthesizer, qwen, kiro, kimi, claude, diff, gemini, antigravity). No remaining references to internal/config exist in auth modules.

pkg/llmproxy/auth/claude/utls_transport.go (1)

11-11: LGTM!

The import path update from internal/config to pkg/llmproxy/config is consistent with the broader refactoring across auth modules described in the PR objectives.

pkg/llmproxy/auth/kimi/kimi.go (1)

18-18: LGTM!

The import path change from internal/config to pkg/llmproxy/config is consistent with the broader refactoring to expose configuration as a public package.

pkg/llmproxy/auth/codex/openai_auth.go (1)

17-17: LGTM!

The import path update aligns with the consistent migration from internal/config to pkg/llmproxy/config across the codebase.

pkg/llmproxy/auth/codex/openai_auth_test.go (2)

12-12: LGTM!

Import correctly added to support the embedded BaseTokenStorage structure in test initializations.


300-300: LGTM!

Test initialization correctly updated to use the embedded BaseTokenStorage structure, maintaining compatibility with field promotion for assertions on storage.AccessToken.

pkg/llmproxy/api/handlers/management/auth_github.go (2)

10-10: LGTM!

Import for base package correctly added to support the embedded BaseTokenStorage pattern.


55-58: LGTM!

Token storage initialization correctly migrated to use the embedded BaseTokenStorage structure with AccessToken and Type fields properly nested.

pkg/llmproxy/auth/copilot/copilot_extra_test.go (3)

13-13: LGTM!

Import correctly added to support BaseTokenStorage usage in test initializations.


179-181: LGTM!

Test initialization correctly updated to construct CopilotTokenStorage with embedded BaseTokenStorage.


187-189: LGTM!

Consistent with the valid token test case, correctly using embedded BaseTokenStorage for the expired token scenario.

pkg/llmproxy/auth/copilot/copilot_auth.go (1)

13-13: LGTM!

Import path correctly updated to pkg/llmproxy/config, consistent with the codebase-wide migration. The SDKConfig field access via cfg.SDKConfig remains valid per the config struct definition.

pkg/llmproxy/auth/kimi/token_path_test.go (3)

6-7: LGTM!

Import correctly added to support BaseTokenStorage usage.


11-11: LGTM!

Test initialization correctly updated to use embedded BaseTokenStorage structure.


18-18: No action needed. The test correctly validates the error message. The underlying SaveTokenToFile delegates to BaseTokenStorage.Save(), which generates the error message "base token storage: invalid file path: %w". Since the test uses strings.Contains() to check for "invalid file path", it will successfully match this error message. The implementation and test are aligned—there is no mismatch or regression.

> Likely an incorrect or invalid review comment.
pkg/llmproxy/auth/codex/token_test.go (2)

8-9: LGTM!

Import correctly added to support BaseTokenStorage usage in test initializations.


21-29: LGTM!

Test initialization correctly updated to nest AccessToken, RefreshToken, and Email inside BaseTokenStorage, while keeping IDToken and AccountID at the top level. The test assertions on lines 46 and 49 will work correctly due to Go's embedded struct field promotion.

pkg/llmproxy/auth/copilot/token_test.go (1)

9-9: Base token embedding update in tests looks correct.

The fixture now matches the migrated token storage shape and should keep serialization coverage aligned with production structs.

Also applies to: 22-25

pkg/llmproxy/auth/gemini/gemini_auth.go (1)

17-17: Config import path update is consistent and safe.

This change cleanly aligns Gemini auth with the new pkg/llmproxy/config package location.

pkg/llmproxy/api/handlers/management/auth_kilo.go (1)

10-10: Kilo token storage migration is applied correctly.

Using BaseTokenStorage here keeps the handler aligned with the shared auth storage model while preserving current behavior.

Also applies to: 60-63

pkg/llmproxy/auth/qwen/qwen_auth_test.go (1)

13-13: Qwen auth test fixture update is consistent with BaseTokenStorage embedding.

The changed setup keeps the traversal-path test aligned with the new storage structure.

Also applies to: 157-157

sdk/auth/kilo.go (1)

8-8: SDK Kilo storage initialization is compatible with the embedded base struct.

This update correctly preserves Email/Type through BaseTokenStorage during auth record construction.

Also applies to: 101-104

pkg/llmproxy/auth/qwen/qwen_token_test.go (1)

8-8: Qwen token tests are correctly updated for BaseTokenStorage embedding.

Both test cases now initialize token data using the shared base storage shape.

Also applies to: 17-20, 35-37

pkg/llmproxy/auth/gemini/gemini_auth_test.go (1)

15-15: Gemini auth tests are aligned with the storage refactor and updated error wording.

The changed fixture shape and traversal-path assertion remain coherent with the new token storage contract.

Also applies to: 50-53, 82-82

pkg/llmproxy/api/handlers/management/auth_gemini.go (1)

14-14: Gemini management handler uses the new base token storage pattern correctly.

This keeps handler-generated auth records compatible with the shared storage model.

Also applies to: 142-144

pkg/llmproxy/auth/qwen/qwen_auth.go (2)

16-16: LGTM!

Import correctly added for the new base package containing BaseTokenStorage.


351-363: LGTM!

The refactored CreateTokenStorage correctly initializes the embedded BaseTokenStorage with token data. This approach:

  • Properly sets AccessToken and RefreshToken within the embedded struct
  • Maintains the Qwen-specific fields (LastRefresh, ResourceURL, Expire) at the outer struct level
  • Aligns with the pattern used across other provider modules in this PR
pkg/llmproxy/auth/qwen/qwen_token.go (3)

19-30: LGTM!

The struct refactoring is well-designed:

  • Anonymous embedding of base.BaseTokenStorage promotes fields to the JSON top level, maintaining backward compatibility with existing token files
  • Value embedding (not pointer) ensures the base storage is always initialized, avoiding nil pointer issues
  • Qwen-specific fields remain cleanly separated at the outer struct level

38-42: LGTM!

Constructor correctly initializes only the FilePath field, which is appropriate since other fields will be populated during authentication or when loading from file.


53-61: LGTM!

The SaveTokenToFile method correctly:

  1. Sets the Type field on the embedded BaseTokenStorage before saving
  2. Passes the full QwenTokenStorage (ts) as the second parameter to Save(), ensuring all provider-specific fields are serialized alongside base fields

This aligns with the Save method signature shown in pkg/llmproxy/auth/base/token_storage.go:52-95.

pkg/llmproxy/thinking/apply.go (1)

535-565: Good iFlow compatibility handling for effort parsing.

Supporting both nested reasoning.effort and escaped literal reasoning\.effort is a solid stabilization change.

pkg/llmproxy/thinking/strip_test.go (1)

9-71: Test coverage for iFlow strip behavior looks solid.

These cases clearly validate field stripping and preservation of unrelated payload content.

pkg/llmproxy/thinking/apply_iflow_test.go (1)

5-64: Good targeted tests for extractIFlowConfig.

The table-driven cases exercise both payload shapes and expected mode/level mapping cleanly.

-	pkgconfig "github.com/kooshapari/cliproxyapi-plusplus/v6/internal/config"
+	pkgconfig "github.com/kooshapari/cliproxyapi-plusplus/v6/pkg/llmproxy/config"
 	tls "github.com/refraction-networking/utls"
 	log "github.com/sirupsen/logrus"

🧹 Nitpick | 🔵 Trivial

Consider migrating to zerolog.

The code uses logrus for logging, but the coding guidelines specify using zerolog. While this is pre-existing code, consider aligning with the guideline in a future refactor.

As per coding guidelines: "Use zerolog for logging (NOT fmt.Print)".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/llmproxy/auth/claude/utls_transport.go` at line 13, the current code
imports logrus as log (import alias: log "github.com/sirupsen/logrus"); a
follow-up
refactor should replace that import with zerolog (e.g., github.com/rs/zerolog)
and update all usages of the log alias (calls like log.Info, log.Errorf,
log.Debug, etc.) to zerolog equivalents (logger.Info().Msg,
logger.Error().Err(err).Msg, etc.), initializing a zerolog logger instance where
needed and ensuring structured fields are migrated to With/Dict patterns; search
for the import line and all references to the log identifier in
utls_transport.go (and related files) to perform the replacement consistently.

Comment on lines +54 to +93
func TestApplyMiniMaxStripsReasoningVariantsAndLegacyFields(t *testing.T) {
a := NewApplier()
body := []byte(`{
"reasoning_split":true,
"reasoning_effort":"high",
"reasoning":{"effort":"medium","summary":{"text":"legacy"}},
"variant":"low",
"foo":"bar"
}`)
cfg := thinking.ThinkingConfig{Mode: thinking.ModeLevel, Level: thinking.LevelLow}
model := &registry.ModelInfo{
ID: "minimax-m2.5",
Thinking: &registry.ThinkingSupport{Levels: []string{"low", "medium", "high"}},
}

out, err := a.Apply(body, cfg, model)
if err != nil {
t.Fatalf("unexpected error: %v", err)
}

res := gjson.ParseBytes(out)
if !res.Get("reasoning_split").Bool() {
t.Fatalf("expected reasoning_split=true")
}
if res.Get("reasoning_effort").Exists() {
t.Fatalf("expected reasoning_effort to be removed")
}
if res.Get("reasoning").Exists() {
t.Fatalf("expected reasoning object to be removed")
}
if res.Get("reasoning.effort").Exists() {
t.Fatalf("expected reasoning.effort to be removed")
}
if res.Get("variant").Exists() {
t.Fatalf("expected variant to be removed")
}
if res.Get("foo").String() != "bar" {
t.Fatalf("expected unrelated fields to be preserved")
}
}

🧹 Nitpick | 🔵 Trivial

Add a regression case for literal reasoning.effort input.

Please add one MiniMax test body containing a top-level "reasoning.effort" key and assert it is removed after Apply, to guard the compatibility cleanup path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/llmproxy/thinking/provider/iflow/apply_test.go` around lines 54-93,
update the TestApplyMiniMaxStripsReasoningVariantsAndLegacyFields test to
include a regression case where the JSON body contains a top-level
"reasoning.effort" key (e.g. "reasoning.effort":"medium") alongside the existing
fields; call a.Apply(body, cfg, model) as before and add an assertion that
res.Get("reasoning.effort").Exists() is false (or equivalent check) to ensure
Apply removes the literal reasoning.effort key; keep the rest of the assertions
unchanged so unrelated fields (e.g. "foo") are still preserved.

Comment on lines +146 to +148
result, _ = sjson.DeleteBytes(result, "reasoning_effort")
result, _ = sjson.DeleteBytes(result, "reasoning")
result, _ = sjson.DeleteBytes(result, "variant")

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, get the file structure and locate the apply.go file
find . -type f -name "apply.go" | head -20

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 503


🏁 Script executed:

# Read the apply.go file with context around lines 146-148
cat -n pkg/llmproxy/thinking/provider/iflow/apply.go | head -170 | tail -50

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 1842


🏁 Script executed:

# Look for extractIFlowConfig and understand what fields it expects
rg -B 5 -A 15 "extractIFlowConfig" pkg/llmproxy/thinking/

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 5435


🏁 Script executed:

# Search for "reasoning.effort" or "reasoning_effort" patterns in the codebase
rg "reasoning[._]effort" --type go

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 31454


🏁 Script executed:

# Check the test for applyMiniMax to see what's expected
cat -n pkg/llmproxy/thinking/provider/iflow/apply_test.go | grep -A 30 "applyMiniMax"

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 57


🏁 Script executed:

# Look for how sjson is used with dotted keys in the codebase
rg -B 2 -A 2 "sjson.*\\\\\." --type go

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 57


🏁 Script executed:

# Check the full apply_iflow_test.go test cases
cat -n pkg/llmproxy/thinking/apply_iflow_test.go | grep -B 10 -A 10 "reasoning.effort"

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 1105


🏁 Script executed:

# Search for actual sjson usage patterns with special characters
rg "sjson\.(Set|Delete)" pkg/llmproxy/thinking/provider/iflow/ --type go -B 1 -A 1

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 1368


🏁 Script executed:

# Look for applyMiniMax tests
cat -n pkg/llmproxy/thinking/provider/iflow/apply_test.go

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 3416


🏁 Script executed:

# Check how sjson handles dotted keys - search for documentation or usage examples
rg "sjson" pkg/llmproxy/thinking/provider/iflow/apply_test.go -B 2 -A 2

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 57


🏁 Script executed:

# Let's check if there are any tests that verify the cleanup of reasoning.effort (literal key)
rg "reasoning\.effort" pkg/llmproxy/thinking/provider/iflow/ -B 3 -A 3

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 754


🏁 Script executed:

# Check if there are any tests that specifically test literal "reasoning.effort" key removal in applyMiniMax
cat -n pkg/llmproxy/thinking/provider/iflow/apply_test.go | grep -B 20 "reasoning.effort"

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 3114


🏁 Script executed:

# Look at the full test to see if literal keys are tested
cat -n pkg/llmproxy/thinking/provider/iflow/apply_test.go | sed -n '54,93p'

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 1546


🏁 Script executed:

# Search for any sjson escape examples or documentation in the codebase
rg "sjson.*Delete.*\\\\\\\\" --type go

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 57


🏁 Script executed:

# Check the tidwall/sjson documentation by searching for test patterns
rg -B 5 -A 5 "literal.*key\|key.*literal" pkg/llmproxy/thinking/provider/iflow/

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 57


🏁 Script executed:

# Run the actual test to see if it passes
cd . && go test ./pkg/llmproxy/thinking/provider/iflow -v -run TestApplyMiniMaxStripsReasoningVariantsAndLegacyFields 2>&1 | head -100

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 597


🏁 Script executed:

# Check if the test input can actually contain a literal "reasoning.effort" key
python3 << 'EOF'
import json
import subprocess

# Test 1: nested structure (what's in the test)
nested = {
    "reasoning_split": True,
    "reasoning_effort": "high",
    "reasoning": {"effort": "medium", "summary": {"text": "legacy"}},
    "variant": "low",
    "foo": "bar"
}

# Test 2: literal dotted key
literal = {
    "reasoning.effort": "high",  # literal key with a dot
    "foo": "bar"
}

print("Test 1 - Nested structure:")
print(json.dumps(nested))
print("\nTest 2 - Literal dotted key:")
print(json.dumps(literal))
EOF

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 326


🏁 Script executed:

# Now let's test if applyMiniMax would clean up a literal "reasoning.effort" key
# by creating a simple test
python3 << 'EOF'
import json

# When JSON is parsed, "reasoning.effort" as a key is literally that string
# gjson.Get("reasoning.effort") looks for nested path reasoning -> effort
# But a literal key "reasoning.effort" is at the top level

body_with_literal_key = {
    "reasoning.effort": "high",
    "foo": "bar"
}

print("JSON with literal dotted key:")
print(json.dumps(body_with_literal_key))
print("\nAccessing via gjson path 'reasoning.effort' would NOT find this")
print("because it looks for nested path, not the literal key")
print("\nAccessing via literal path 'reasoning\\.effort' (escaped) would find it")
EOF

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 338


🏁 Script executed:

# Look for how sjson handles literal keys with dots
rg -B 3 -A 3 "sjson.*Delete.*reasoning" pkg/llmproxy/thinking/ --type go

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 693


🏁 Script executed:

# Check if there's documentation or examples of cleaning literal dotted keys in tests
cat -n pkg/llmproxy/thinking/strip_test.go | head -50

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 1770


🏁 Script executed:

# Let's verify what the extractIFlowConfig actually extracts from a literal key
cat -n pkg/llmproxy/thinking/apply.go | grep -A 30 "func extractIFlowConfig"

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 1409


🏁 Script executed:

# Now create a test to verify the behavior with a literal "reasoning.effort" key
python3 << 'EOF'
import json
import subprocess

# Create test input with literal "reasoning.effort" key
test_json = {
    "reasoning.effort": "high",
    "reasoning_split": False,
    "foo": "bar"
}

print("Test JSON with literal 'reasoning.effort' key:")
print(json.dumps(test_json))

# This would be sent as input to applyMiniMax
# The question is: does the cleanup remove it?
EOF

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 190


🏁 Script executed:

# Check the actual behavior by looking at strip.go which seems to handle similar cleanup
cat -n pkg/llmproxy/thinking/strip.go | head -80

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 2353


🏁 Script executed:

# Perfect! I can see the key finding: in extractIFlowConfig (line 541), 
# it specifically looks for escaped literal key "reasoning\\.effort"
# Let's verify if applyMiniMax actually handles this

# Create a test to see if the literal key would be cleaned up
python3 << 'EOF'
# The critical finding:
# extractIFlowConfig DOES handle both:
# 1. Nested: gjson.GetBytes(body, "reasoning.effort") 
# 2. Literal: gjson.GetBytes(body, "reasoning\\.effort")

# But applyMiniMax only cleans:
# - "reasoning_effort" (underscore)
# - "reasoning" (nested object)
# - "variant"

# It does NOT clean the literal "reasoning.effort" key!
# This means if extractIFlowConfig accepts it, but applyMiniMax doesn't remove it,
# it will leak upstream.

print("Confirmed: The literal 'reasoning.effort' key is supported by extractIFlowConfig")
print("but NOT cleaned up by applyMiniMax - this is indeed a bug!")
EOF

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 214


🏁 Script executed:

# Now let's verify the correct escaping for sjson
# Check how sjson is used elsewhere with dots
rg "sjson.*\\\\\\\\" --type go | head -20

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 57


🏁 Script executed:

# Let's see if there are any other examples of sjson with escaped keys
rg "sjson.Delete.*\\\\" pkg/llmproxy/ --type go

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 57


🏁 Script executed:

# Check what the web says about sjson path escaping
cat > /tmp/test_sjson.go << 'EOF'
package main

import (
	"fmt"
	"github.com/tidwall/sjson"
	"github.com/tidwall/gjson"
)

func main() {
	// Test: does escaping work for sjson.DeleteBytes?
	body := []byte(`{"reasoning.effort":"high","foo":"bar"}`)
	fmt.Println("Original:", string(body))
	
	// Try to delete with escaped path
	result, _ := sjson.DeleteBytes(body, "reasoning\\.effort")
	fmt.Println("After delete:", string(result))
	
	// Verify it's gone
	if gjson.GetBytes(result, "reasoning\\.effort").Exists() {
		fmt.Println("ERROR: literal key still exists!")
	} else {
		fmt.Println("SUCCESS: literal key was removed")
	}
}
EOF
cd /tmp && go run test_sjson.go 2>&1

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 470


🏁 Script executed:

# Let's run the actual test in the repo context to see if there's a test case for literal keys
cd pkg/llmproxy/thinking/provider/iflow && go test -v -run TestApplyMiniMaxStripsReasoningVariantsAndLegacyFields 2>&1 | tail -20

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 312


🏁 Script executed:

# Check if the test input actually includes a literal "reasoning.effort" key
cat -n pkg/llmproxy/thinking/provider/iflow/apply_test.go | sed -n '54,62p'

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 407


🏁 Script executed:

# Let's create a focused test in the actual repo to verify the behavior
cat > /tmp/test_literal_key.sh << 'EOF'
#!/bin/bash
cd /repo
cat > test_literal_key.go << 'GOEOF'
package main

import (
	"fmt"
	"github.com/tidwall/sjson"
	"github.com/tidwall/gjson"
)

func main() {
	// Test case: literal "reasoning.effort" key
	body := []byte(`{"reasoning.effort":"high","foo":"bar"}`)
	fmt.Println("Original:", string(body))
	
	// Current applyMiniMax cleanup (no escape)
	result1, _ := sjson.DeleteBytes(body, "reasoning.effort")
	fmt.Println("After unescaped delete:", string(result1))
	fmt.Println("Literal key still exists?", gjson.GetBytes(result1, "reasoning\\.effort").Exists())
	
	// Proposed fix with escape
	result2, _ := sjson.DeleteBytes(body, "reasoning\\.effort")
	fmt.Println("After escaped delete:", string(result2))
	fmt.Println("Literal key still exists?", gjson.GetBytes(result2, "reasoning\\.effort").Exists())
}
GOEOF
go run test_literal_key.go 2>&1
EOF
bash /tmp/test_literal_key.sh

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 375


Add cleanup for literal reasoning.effort key in applyMiniMax.

The current cleanup removes the nested reasoning object and the underscore form reasoning_effort, but misses the literal dotted key "reasoning.effort". Since extractIFlowConfig explicitly accepts this format (via the escaped path lookup "reasoning\\.effort"), the key can be present in input; because it is never removed, it leaks upstream in the transformed request.

Use the escaped path "reasoning\\.effort" with sjson.DeleteBytes to remove the literal key:

Proposed fix
 result, _ := sjson.SetBytes(body, "reasoning_split", reasoningSplit)
 result, _ = sjson.DeleteBytes(result, "reasoning_effort")
+result, _ = sjson.DeleteBytes(result, "reasoning\\.effort")
 result, _ = sjson.DeleteBytes(result, "reasoning")
 result, _ = sjson.DeleteBytes(result, "variant")
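
The distinction the fix relies on — a literal top-level key named "reasoning.effort" versus the nested reasoning.effort path — can be sketched with plain JSON. This is a minimal Python illustration, independent of the Go sjson library the repo actually uses:

```python
import json

# A request body can carry both shapes at once: a nested
# {"reasoning": {"effort": ...}} object and a literal top-level key
# whose name contains a dot.
body = '{"reasoning":{"effort":"medium"},"reasoning.effort":"high","foo":"bar"}'
m = json.loads(body)

# Deleting the nested object leaves the literal dotted key untouched,
# which is why applyMiniMax's existing deletes miss it.
m.pop("reasoning", None)
assert "reasoning.effort" in m

# The literal key must be removed by its exact name; in sjson this
# corresponds to the escaped path "reasoning\\.effort".
m.pop("reasoning.effort", None)
cleaned = json.dumps(m)
print(cleaned)  # {"foo": "bar"} — unrelated fields are preserved
```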
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/llmproxy/thinking/provider/iflow/apply.go` around lines 146 - 148, the
cleanup in applyMiniMax misses the literal dotted key "reasoning.effort". Add an
sjson.DeleteBytes call that removes the escaped path "reasoning\\.effort"
alongside the existing deletes: locate the applyMiniMax function in
pkg/llmproxy/thinking/provider/iflow/apply.go and insert
sjson.DeleteBytes(result, "reasoning\\.effort") (following the existing
error-handling pattern) so that keys accepted by extractIFlowConfig (which reads
"reasoning\\.effort") are not leaked upstream.

Comment on lines 43 to 50
paths = []string{
"chat_template_kwargs.enable_thinking",
"chat_template_kwargs.clear_thinking",
"reasoning",
"reasoning_split",
"reasoning_effort",
"variant",
}

⚠️ Potential issue | 🟠 Major

❓ Verification inconclusive

Script executed:

find . -type f -name "*.go" | grep -E "(strip|apply)" | head -20

Repository: KooshaPari/cliproxyapi-plusplus


Exit code: 0

stdout:

./pkg/llmproxy/thinking/apply.go
./pkg/llmproxy/thinking/apply_codex_variant_test.go
./pkg/llmproxy/thinking/apply_iflow_test.go
./pkg/llmproxy/thinking/apply_logging_test.go
./pkg/llmproxy/thinking/provider/antigravity/apply.go
./pkg/llmproxy/thinking/provider/antigravity/apply_test.go
./pkg/llmproxy/thinking/provider/claude/apply.go
./pkg/llmproxy/thinking/provider/claude/apply_test.go
./pkg/llmproxy/thinking/provider/codex/apply.go
./pkg/llmproxy/thinking/provider/gemini/apply.go
./pkg/llmproxy/thinking/provider/gemini/apply_test.go
./pkg/llmproxy/thinking/provider/geminicli/apply.go
./pkg/llmproxy/thinking/provider/geminicli/apply_test.go
./pkg/llmproxy/thinking/provider/iflow/apply.go
./pkg/llmproxy/thinking/provider/iflow/apply_test.go
./pkg/llmproxy/thinking/provider/kimi/apply.go
./pkg/llmproxy/thinking/provider/openai/apply.go
./pkg/llmproxy/thinking/strip.go
./pkg/llmproxy/thinking/strip_test.go

Script executed:

cd . && git ls-files | grep -E "llmproxy.*thinking" | head -20

Repository: KooshaPari/cliproxyapi-plusplus


Exit code: 0

stdout:

pkg/llmproxy/executor/thinking_providers.go
pkg/llmproxy/thinking/apply.go
pkg/llmproxy/thinking/apply_codex_variant_test.go
pkg/llmproxy/thinking/apply_iflow_test.go
pkg/llmproxy/thinking/apply_logging_test.go
pkg/llmproxy/thinking/convert.go
pkg/llmproxy/thinking/convert_test.go
pkg/llmproxy/thinking/errors.go
pkg/llmproxy/thinking/log_redaction.go
pkg/llmproxy/thinking/log_redaction_test.go
pkg/llmproxy/thinking/provider/antigravity/apply.go
pkg/llmproxy/thinking/provider/antigravity/apply_test.go
pkg/llmproxy/thinking/provider/claude/apply.go
pkg/llmproxy/thinking/provider/claude/apply_test.go
pkg/llmproxy/thinking/provider/codex/apply.go
pkg/llmproxy/thinking/provider/gemini/apply.go
pkg/llmproxy/thinking/provider/gemini/apply_test.go
pkg/llmproxy/thinking/provider/geminicli/apply.go
pkg/llmproxy/thinking/provider/geminicli/apply_test.go
pkg/llmproxy/thinking/provider/iflow/apply.go

Script executed:

cat -n pkg/llmproxy/thinking/strip.go

Repository: KooshaPari/cliproxyapi-plusplus


Exit code: 0

stdout:

     1	// Package thinking provides unified thinking configuration processing.
     2	package thinking
     3	
     4	import (
     5		"github.com/tidwall/gjson"
     6		"github.com/tidwall/sjson"
     7	)
     8	
     9	// StripThinkingConfig removes thinking configuration fields from request body.
    10	//
    11	// This function is used when a model doesn't support thinking but the request
    12	// contains thinking configuration. The configuration is silently removed to
    13	// prevent upstream API errors.
    14	//
    15	// Parameters:
    16	//   - body: Original request body JSON
    17	//   - provider: Provider name (determines which fields to strip)
    18	//
    19	// Returns:
    20	//   - Modified request body JSON with thinking configuration removed
    21	//   - Original body is returned unchanged if:
    22	//   - body is empty or invalid JSON
    23	//   - provider is unknown
    24	//   - no thinking configuration found
    25	func StripThinkingConfig(body []byte, provider string) []byte {
    26		if len(body) == 0 || !gjson.ValidBytes(body) {
    27			return body
    28		}
    29	
    30		var paths []string
    31		switch provider {
    32		case "claude":
    33			paths = []string{"thinking"}
    34		case "gemini":
    35			paths = []string{"generationConfig.thinkingConfig"}
    36		case "gemini-cli", "antigravity":
    37			paths = []string{"request.generationConfig.thinkingConfig"}
    38		case "openai":
    39			paths = []string{"reasoning_effort"}
    40		case "codex":
    41			paths = []string{"reasoning.effort"}
    42		case "iflow":
    43			paths = []string{
    44				"chat_template_kwargs.enable_thinking",
    45				"chat_template_kwargs.clear_thinking",
    46				"reasoning",
    47				"reasoning_split",
    48				"reasoning_effort",
    49				"variant",
    50			}
    51		default:
    52			return body
    53		}
    54	
    55		result := body
    56		for _, path := range paths {
    57			result, _ = sjson.DeleteBytes(result, path)
    58		}
    59		if provider == "iflow" {
    60			if kwargs := gjson.GetBytes(result, "chat_template_kwargs"); kwargs.Exists() && len(kwargs.Map()) == 0 {
    61				result, _ = sjson.DeleteBytes(result, "chat_template_kwargs")
    62			}
    63		}
    64		return result
    65	}

Script executed:

cat -n pkg/llmproxy/thinking/provider/iflow/apply.go

Repository: KooshaPari/cliproxyapi-plusplus


Exit code: 0

stdout:

     1	// Package iflow implements thinking configuration for iFlow models.
     2	//
     3	// iFlow models use boolean toggle semantics:
     4	//   - Models using chat_template_kwargs.enable_thinking (boolean toggle)
     5	//   - MiniMax models: reasoning_split (boolean)
     6	//
     7	// Level values are converted to boolean: none=false, all others=true
     8	// See: _bmad-output/planning-artifacts/architecture.md#Epic-9
     9	package iflow
    10	
    11	import (
    12		"strings"
    13	
    14		"github.com/kooshapari/cliproxyapi-plusplus/v6/pkg/llmproxy/registry"
    15		"github.com/kooshapari/cliproxyapi-plusplus/v6/pkg/llmproxy/thinking"
    16		"github.com/tidwall/gjson"
    17		"github.com/tidwall/sjson"
    18	)
    19	
    20	// Applier implements thinking.ProviderApplier for iFlow models.
    21	//
    22	// iFlow-specific behavior:
    23	//   - enable_thinking toggle models: enable_thinking boolean
    24	//   - GLM models: enable_thinking boolean + clear_thinking=false
    25	//   - MiniMax models: reasoning_split boolean
    26	//   - Level to boolean: none=false, others=true
    27	//   - No quantized support (only on/off)
    28	type Applier struct{}
    29	
    30	var _ thinking.ProviderApplier = (*Applier)(nil)
    31	
    32	// NewApplier creates a new iFlow thinking applier.
    33	func NewApplier() *Applier {
    34		return &Applier{}
    35	}
    36	
    37	func init() {
    38		thinking.RegisterProvider("iflow", NewApplier())
    39	}
    40	
    41	// Apply applies thinking configuration to iFlow request body.
    42	//
    43	// Expected output format (GLM):
    44	//
    45	//	{
    46	//	  "chat_template_kwargs": {
    47	//	    "enable_thinking": true,
    48	//	    "clear_thinking": false
    49	//	  }
    50	//	}
    51	//
    52	// Expected output format (MiniMax):
    53	//
    54	//	{
    55	//	  "reasoning_split": true
    56	//	}
    57	func (a *Applier) Apply(body []byte, config thinking.ThinkingConfig, modelInfo *registry.ModelInfo) ([]byte, error) {
    58		if thinking.IsUserDefinedModel(modelInfo) {
    59			return body, nil
    60		}
    61		if modelInfo.Thinking == nil {
    62			return body, nil
    63		}
    64	
    65		if isEnableThinkingModel(modelInfo.ID) {
    66			return applyEnableThinking(body, config, isGLMModel(modelInfo.ID)), nil
    67		}
    68	
    69		if isMiniMaxModel(modelInfo.ID) {
    70			return applyMiniMax(body, config), nil
    71		}
    72	
    73		return body, nil
    74	}
    75	
    76	// configToBoolean converts ThinkingConfig to boolean for iFlow models.
    77	//
    78	// Conversion rules:
    79	//   - ModeNone: false
    80	//   - ModeAuto: true
    81	//   - ModeBudget + Budget=0: false
    82	//   - ModeBudget + Budget>0: true
    83	//   - ModeLevel + Level="none": false
    84	//   - ModeLevel + any other level: true
    85	//   - Default (unknown mode): true
    86	func configToBoolean(config thinking.ThinkingConfig) bool {
    87		switch config.Mode {
    88		case thinking.ModeNone:
    89			return false
    90		case thinking.ModeAuto:
    91			return true
    92		case thinking.ModeBudget:
    93			return config.Budget > 0
    94		case thinking.ModeLevel:
    95			return config.Level != thinking.LevelNone
    96		default:
    97			return true
    98		}
    99	}
   100	
   101	// applyEnableThinking applies thinking configuration for models that use
   102	// chat_template_kwargs.enable_thinking format.
   103	//
   104	// Output format when enabled:
   105	//
   106	//	{"chat_template_kwargs": {"enable_thinking": true, "clear_thinking": false}}
   107	//
   108	// Output format when disabled:
   109	//
   110	//	{"chat_template_kwargs": {"enable_thinking": false}}
   111	//
   112	// Note: clear_thinking is only set for GLM models when thinking is enabled.
   113	func applyEnableThinking(body []byte, config thinking.ThinkingConfig, setClearThinking bool) []byte {
   114		enableThinking := configToBoolean(config)
   115	
   116		if len(body) == 0 || !gjson.ValidBytes(body) {
   117			body = []byte(`{}`)
   118		}
   119	
   120		result, _ := sjson.SetBytes(body, "chat_template_kwargs.enable_thinking", enableThinking)
   121	
   122		// clear_thinking is a GLM-only knob, strip it for other models.
   123		result, _ = sjson.DeleteBytes(result, "chat_template_kwargs.clear_thinking")
   124	
   125		// clear_thinking only needed when thinking is enabled
   126		if enableThinking && setClearThinking {
   127			result, _ = sjson.SetBytes(result, "chat_template_kwargs.clear_thinking", false)
   128		}
   129	
   130		return result
   131	}
   132	
   133	// applyMiniMax applies thinking configuration for MiniMax models.
   134	//
   135	// Output format:
   136	//
   137	//	{"reasoning_split": true/false}
   138	func applyMiniMax(body []byte, config thinking.ThinkingConfig) []byte {
   139		reasoningSplit := configToBoolean(config)
   140	
   141		if len(body) == 0 || !gjson.ValidBytes(body) {
   142			body = []byte(`{}`)
   143		}
   144	
   145		result, _ := sjson.SetBytes(body, "reasoning_split", reasoningSplit)
   146		result, _ = sjson.DeleteBytes(result, "reasoning_effort")
   147		result, _ = sjson.DeleteBytes(result, "reasoning")
   148		result, _ = sjson.DeleteBytes(result, "variant")
   149	
   150		return result
   151	}
   152	
   153	// isEnableThinkingModel determines if the model uses chat_template_kwargs.enable_thinking format.
   154	func isEnableThinkingModel(modelID string) bool {
   155		if isGLMModel(modelID) {
   156			return true
   157		}
   158		id := strings.ToLower(modelID)
   159		switch id {
   160		case "qwen3-max-preview", "deepseek-v3.2", "deepseek-v3.1":
   161			return true
   162		default:
   163			return false
   164		}
   165	}
   166	
   167	// isGLMModel determines if the model is a GLM series model.
   168	func isGLMModel(modelID string) bool {
   169		return strings.HasPrefix(strings.ToLower(modelID), "glm")
   170	}
   171	
   172	// isMiniMaxModel determines if the model is a MiniMax series model.
   173	// MiniMax models use reasoning_split format.
   174	func isMiniMaxModel(modelID string) bool {
   175		return strings.HasPrefix(strings.ToLower(modelID), "minimax")
   176	}

Script executed:

cat -n pkg/llmproxy/thinking/provider/iflow/apply_test.go

Repository: KooshaPari/cliproxyapi-plusplus


Exit code: 0

stdout:

     1	package iflow
     2	
     3	import (
     4		"testing"
     5	
     6		"github.com/kooshapari/cliproxyapi-plusplus/v6/pkg/llmproxy/registry"
     7		"github.com/kooshapari/cliproxyapi-plusplus/v6/pkg/llmproxy/thinking"
     8		"github.com/tidwall/gjson"
     9	)
    10	
    11	func TestApplyMiniMaxStripsReasoningEffort(t *testing.T) {
    12		a := NewApplier()
    13		body := []byte(`{"reasoning_effort":"high","foo":"bar"}`)
    14		cfg := thinking.ThinkingConfig{Mode: thinking.ModeLevel, Level: thinking.LevelHigh}
    15		model := &registry.ModelInfo{
    16			ID:       "minimax-m2.5",
    17			Thinking: &registry.ThinkingSupport{Levels: []string{"low", "medium", "high"}},
    18		}
    19	
    20		out, err := a.Apply(body, cfg, model)
    21		if err != nil {
    22			t.Fatalf("unexpected error: %v", err)
    23		}
    24	
    25		res := gjson.ParseBytes(out)
    26		if !res.Get("reasoning_split").Bool() {
    27			t.Fatalf("expected reasoning_split=true for non-none config")
    28		}
    29		if res.Get("reasoning_effort").Exists() {
    30			t.Fatalf("expected reasoning_effort to be removed")
    31		}
    32	}
    33	
    34	func TestApplyMiniMaxSetsReasoningSplitFalseForModeNone(t *testing.T) {
    35		a := NewApplier()
    36		body := []byte(`{"reasoning_effort":"high","foo":"bar"}`)
    37		cfg := thinking.ThinkingConfig{Mode: thinking.ModeNone}
    38		model := &registry.ModelInfo{
    39			ID:       "minimax-m2",
    40			Thinking: &registry.ThinkingSupport{Levels: []string{"none", "low", "high"}},
    41		}
    42	
    43		out, err := a.Apply(body, cfg, model)
    44		if err != nil {
    45			t.Fatalf("unexpected error: %v", err)
    46		}
    47	
    48		res := gjson.ParseBytes(out)
    49		if res.Get("reasoning_split").Bool() {
    50			t.Fatalf("expected reasoning_split=false for ModeNone")
    51		}
    52	}
    53	
    54	func TestApplyMiniMaxStripsReasoningVariantsAndLegacyFields(t *testing.T) {
    55		a := NewApplier()
    56		body := []byte(`{
    57			"reasoning_split":true,
    58			"reasoning_effort":"high",
    59			"reasoning":{"effort":"medium","summary":{"text":"legacy"}},
    60			"variant":"low",
    61			"foo":"bar"
    62		}`)
    63		cfg := thinking.ThinkingConfig{Mode: thinking.ModeLevel, Level: thinking.LevelLow}
    64		model := &registry.ModelInfo{
    65			ID:       "minimax-m2.5",
    66			Thinking: &registry.ThinkingSupport{Levels: []string{"low", "medium", "high"}},
    67		}
    68	
    69		out, err := a.Apply(body, cfg, model)
    70		if err != nil {
    71			t.Fatalf("unexpected error: %v", err)
    72		}
    73	
    74		res := gjson.ParseBytes(out)
    75		if !res.Get("reasoning_split").Bool() {
    76			t.Fatalf("expected reasoning_split=true")
    77		}
    78		if res.Get("reasoning_effort").Exists() {
    79			t.Fatalf("expected reasoning_effort to be removed")
    80		}
    81		if res.Get("reasoning").Exists() {
    82			t.Fatalf("expected reasoning object to be removed")
    83		}
    84		if res.Get("reasoning.effort").Exists() {
    85			t.Fatalf("expected reasoning.effort to be removed")
    86		}
    87		if res.Get("variant").Exists() {
    88			t.Fatalf("expected variant to be removed")
    89		}
    90		if res.Get("foo").String() != "bar" {
    91			t.Fatalf("expected unrelated fields to be preserved")
    92		}
    93	}

Script executed:

rg "reasoning\.effort" --type go -A 2 -B 2

Repository: KooshaPari/cliproxyapi-plusplus


Exit code: 0

stdout:

test/thinking_conversion_test.go-			model:       "level-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model","messages":[{"role":"user","content":"hi"}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "medium",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model(medium)",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model(medium)","messages":[{"role":"user","content":"hi"}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "medium",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model(none)",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model(none)","messages":[{"role":"user","content":"hi"}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "minimal",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model(auto)",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model(auto)","messages":[{"role":"user","content":"hi"}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "medium",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model","contents":[{"role":"user","parts":[{"text":"hi"}]}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "medium",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model(8192)",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model(8192)","contents":[{"role":"user","parts":[{"text":"hi"}]}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "medium",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model(64000)",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model(64000)","contents":[{"role":"user","parts":[{"text":"hi"}]}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "high",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model(0)",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model(0)","contents":[{"role":"user","parts":[{"text":"hi"}]}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "minimal",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model(-1)",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model(-1)","contents":[{"role":"user","parts":[{"text":"hi"}]}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "medium",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "user-defined-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"user-defined-model","messages":[{"role":"user","content":"hi"}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "medium",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "user-defined-model(8192)",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"user-defined-model(8192)","messages":[{"role":"user","content":"hi"}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "medium",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "user-defined-model(64000)",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"user-defined-model(64000)","messages":[{"role":"user","content":"hi"}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "xhigh",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "user-defined-model(0)",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"user-defined-model(0)","messages":[{"role":"user","content":"hi"}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "none",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "user-defined-model(-1)",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"user-defined-model(-1)","messages":[{"role":"user","content":"hi"}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "auto",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			expectErr:   true,
test/thinking_conversion_test.go-		},
test/thinking_conversion_test.go:		// Case 82: OpenAI-Response to Codex, level high → passthrough reasoning.effort
test/thinking_conversion_test.go-		{
test/thinking_conversion_test.go-			name:        "82",
--
test/thinking_conversion_test.go-			model:       "level-model(high)",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model(high)","input":[{"role":"user","content":"hi"}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "high",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "gpt-5-codex(high)",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"gpt-5-codex(high)","input":[{"role":"user","content":"hi"}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "high",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "gpt-5.2-codex(xhigh)",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"gpt-5.2-codex(xhigh)","input":[{"role":"user","content":"hi"}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "xhigh",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "gpt-5.2-codex(none)",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"gpt-5.2-codex(none)","input":[{"role":"user","content":"hi"}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "none",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "gpt-5-codex(none)",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"gpt-5-codex(none)","input":[{"role":"user","content":"hi"}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "low",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model","messages":[{"role":"user","content":"hi"}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "medium",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model","messages":[{"role":"user","content":"hi"}],"reasoning_effort":"medium"}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "medium",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model","messages":[{"role":"user","content":"hi"}],"reasoning_effort":"none"}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "minimal",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model","messages":[{"role":"user","content":"hi"}],"reasoning_effort":"auto"}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "medium",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model","contents":[{"role":"user","parts":[{"text":"hi"}]}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "medium",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model","contents":[{"role":"user","parts":[{"text":"hi"}]}],"generationConfig":{"thinkingConfig":{"thinkingBudget":8192}}}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "medium",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model","contents":[{"role":"user","parts":[{"text":"hi"}]}],"generationConfig":{"thinkingConfig":{"thinkingBudget":64000}}}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "high",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model","contents":[{"role":"user","parts":[{"text":"hi"}]}],"generationConfig":{"thinkingConfig":{"thinkingBudget":0}}}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "minimal",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model","contents":[{"role":"user","parts":[{"text":"hi"}]}],"generationConfig":{"thinkingConfig":{"thinkingBudget":-1}}}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "medium",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "user-defined-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"user-defined-model","messages":[{"role":"user","content":"hi"}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "medium",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "user-defined-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"user-defined-model","messages":[{"role":"user","content":"hi"}],"thinking":{"type":"enabled","budget_tokens":8192}}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "medium",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "user-defined-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"user-defined-model","messages":[{"role":"user","content":"hi"}],"thinking":{"type":"enabled","budget_tokens":64000}}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "xhigh",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "user-defined-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"user-defined-model","messages":[{"role":"user","content":"hi"}],"thinking":{"type":"enabled","budget_tokens":0}}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "none",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "user-defined-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"user-defined-model","messages":[{"role":"user","content":"hi"}],"thinking":{"type":"enabled","budget_tokens":-1}}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "auto",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			expectErr:   false,
test/thinking_conversion_test.go-		},
test/thinking_conversion_test.go:		// Case 78: OpenAI-Response reasoning.effort=medium to Gemini → 8192
test/thinking_conversion_test.go-		{
test/thinking_conversion_test.go-			name:            "78",
--
test/thinking_conversion_test.go-			expectErr:       false,
test/thinking_conversion_test.go-		},
test/thinking_conversion_test.go:		// Case 79: OpenAI-Response reasoning.effort=medium to Claude → 8192
test/thinking_conversion_test.go-		{
test/thinking_conversion_test.go-			name:        "79",
--
test/thinking_conversion_test.go-			expectErr:   true,
test/thinking_conversion_test.go-		},
test/thinking_conversion_test.go:		// Case 82: OpenAI-Response to Codex, reasoning.effort=high → passthrough
test/thinking_conversion_test.go-		{
test/thinking_conversion_test.go-			name:        "82",
--
test/thinking_conversion_test.go-			model:       "level-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model","input":[{"role":"user","content":"hi"}],"reasoning":{"effort":"high"}}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "high",
test/thinking_conversion_test.go-			expectErr:   false,
test/thinking_conversion_test.go-		},
test/thinking_conversion_test.go:		// Case 83: OpenAI-Response to Codex, reasoning.effort=xhigh → out of range error
test/thinking_conversion_test.go-		{
test/thinking_conversion_test.go-			name:        "83",
--
test/thinking_conversion_test.go-		// GitHub Copilot tests: /responses endpoint (codex format) - with body params
test/thinking_conversion_test.go-
test/thinking_conversion_test.go:		// Case 118: OpenAI-Response to gpt-5-codex, reasoning.effort=high → high
test/thinking_conversion_test.go-		{
test/thinking_conversion_test.go-			name:        "118",
--
test/thinking_conversion_test.go-			model:       "gpt-5-codex",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"gpt-5-codex","input":[{"role":"user","content":"hi"}],"reasoning":{"effort":"high"}}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "high",
test/thinking_conversion_test.go-			expectErr:   false,
test/thinking_conversion_test.go-		},
test/thinking_conversion_test.go:		// Case 119: OpenAI-Response to gpt-5.2-codex, reasoning.effort=xhigh → xhigh
test/thinking_conversion_test.go-		{
test/thinking_conversion_test.go-			name:        "119",
--
test/thinking_conversion_test.go-			model:       "gpt-5.2-codex",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"gpt-5.2-codex","input":[{"role":"user","content":"hi"}],"reasoning":{"effort":"xhigh"}}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "xhigh",
test/thinking_conversion_test.go-			expectErr:   false,
test/thinking_conversion_test.go-		},
test/thinking_conversion_test.go:		// Case 120: OpenAI-Response to gpt-5.2-codex, reasoning.effort=none → none
test/thinking_conversion_test.go-		{
test/thinking_conversion_test.go-			name:        "120",
--
test/thinking_conversion_test.go-			model:       "gpt-5.2-codex",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"gpt-5.2-codex","input":[{"role":"user","content":"hi"}],"reasoning":{"effort":"none"}}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "none",
test/thinking_conversion_test.go-			expectErr:   false,
test/thinking_conversion_test.go-		},
test/thinking_conversion_test.go:		// Case 121: OpenAI-Response to gpt-5-codex, reasoning.effort=none → clamped to low (ZeroAllowed=false)
test/thinking_conversion_test.go-		{
test/thinking_conversion_test.go-			name:        "121",
--
test/thinking_conversion_test.go-			model:       "gpt-5-codex",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"gpt-5-codex","input":[{"role":"user","content":"hi"}],"reasoning":{"effort":"none"}}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "low",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model","messages":[{"role":"user","content":"hi"}],"thinking":{"type":"adaptive"}}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "high",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-					hasThinking = gjson.GetBytes(body, "reasoning_effort").Exists()
test/thinking_conversion_test.go-				case "codex":
test/thinking_conversion_test.go:					hasThinking = gjson.GetBytes(body, "reasoning.effort").Exists() || gjson.GetBytes(body, "reasoning").Exists()
test/thinking_conversion_test.go-				case "iflow":
test/thinking_conversion_test.go-					hasThinking = gjson.GetBytes(body, "chat_template_kwargs.enable_thinking").Exists() || gjson.GetBytes(body, "reasoning_split").Exists()
--
pkg/llmproxy/translator/openai/openai/responses/openai_openai-responses_request.go-	//
pkg/llmproxy/translator/openai/openai/responses/openai_openai-responses_request.go-	// Priority:
pkg/llmproxy/translator/openai/openai/responses/openai_openai-responses_request.go:	// 1. reasoning.effort object field
pkg/llmproxy/translator/openai/openai/responses/openai_openai-responses_request.go:	// 2. flat legacy field "reasoning.effort"
pkg/llmproxy/translator/openai/openai/responses/openai_openai-responses_request.go-	// 3. variant
pkg/llmproxy/translator/openai/openai/responses/openai_openai-responses_request.go:	if reasoningEffort := root.Get("reasoning.effort"); reasoningEffort.Exists() {
pkg/llmproxy/translator/openai/openai/responses/openai_openai-responses_request.go-		effort := strings.ToLower(strings.TrimSpace(reasoningEffort.String()))
pkg/llmproxy/translator/openai/openai/responses/openai_openai-responses_request.go-		if effort != "" {
--
pkg/llmproxy/translator/gemini/openai/responses/gemini_openai-responses_request.go-	}
pkg/llmproxy/translator/gemini/openai/responses/gemini_openai-responses_request.go-
pkg/llmproxy/translator/gemini/openai/responses/gemini_openai-responses_request.go:	// Apply thinking configuration: convert OpenAI Responses API reasoning.effort to Gemini thinkingConfig.
pkg/llmproxy/translator/gemini/openai/responses/gemini_openai-responses_request.go-	// Inline translation-only mapping; capability checks happen later in ApplyThinking.
pkg/llmproxy/translator/gemini/openai/responses/gemini_openai-responses_request.go:	re := root.Get("reasoning.effort")
pkg/llmproxy/translator/gemini/openai/responses/gemini_openai-responses_request.go-	if re.Exists() {
pkg/llmproxy/translator/gemini/openai/responses/gemini_openai-responses_request.go-		effort := strings.ToLower(strings.TrimSpace(re.String()))
--
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-	}
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go:	if res.Get("reasoning.effort").String() != "medium" {
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go:		t.Errorf("expected reasoning.effort medium, got %s", res.Get("reasoning.effort").String())
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-	}
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-
--
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-	res2 := gjson.ParseBytes(got2)
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go:	if res2.Get("reasoning.effort").String() != "high" {
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go:		t.Errorf("expected reasoning.effort high, got %s", res2.Get("reasoning.effort").String())
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-	}
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-
--
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-	res := gjson.ParseBytes(got)
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go:	if gotEffort := res.Get("reasoning.effort").String(); gotEffort != "high" {
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go:		t.Fatalf("expected reasoning.effort to use variant fallback high, got %s", gotEffort)
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-	}
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-}
--
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-		"model": "gpt-4o",
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-		"messages": [{"role":"user","content":"hello"}],
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go:		"reasoning.effort": "low"
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-	}`)
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-	got := ConvertOpenAIRequestToCodex("gpt-4o", input, false)
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-	res := gjson.ParseBytes(got)
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go:	if gotEffort := res.Get("reasoning.effort").String(); gotEffort != "low" {
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go:		t.Fatalf("expected reasoning.effort to use legacy flat field low, got %s", gotEffort)
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-	}
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-}
--
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-	res := gjson.ParseBytes(got)
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go:	if gotEffort := res.Get("reasoning.effort").String(); gotEffort != "low" {
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go:		t.Fatalf("expected reasoning.effort to prefer reasoning_effort low, got %s", gotEffort)
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-	}
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-}
--
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request_test.go-	outputStr := string(output)
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request_test.go-
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request_test.go:	if got := gjson.Get(outputStr, "reasoning.effort").String(); got != "high" {
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request_test.go:		t.Fatalf("expected reasoning.effort=high fallback, got %s", got)
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request_test.go-	}
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request_test.go-}
--
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request_test.go-	outputStr := string(output)
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request_test.go-
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request_test.go:	if got := gjson.Get(outputStr, "reasoning.effort").String(); got != "low" {
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request_test.go:		t.Fatalf("expected reasoning.effort to prefer explicit reasoning.effort low, got %s", got)
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request_test.go-	}
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request_test.go-}
--
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request.go-	// Map reasoning effort; support flat legacy field and variant fallback.
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request.go-	if v := gjson.GetBytes(rawJSON, "reasoning_effort"); v.Exists() {
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request.go:		out, _ = sjson.Set(out, "reasoning.effort", v.Value())
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request.go-	} else if v := gjson.GetBytes(rawJSON, `reasoning\.effort`); v.Exists() {
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request.go:		out, _ = sjson.Set(out, "reasoning.effort", v.Value())
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request.go-	} else if v := gjson.GetBytes(rawJSON, "variant"); v.Exists() {
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request.go-		effort := strings.ToLower(strings.TrimSpace(v.String()))
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request.go-		if effort == "" {
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request.go:			out, _ = sjson.Set(out, "reasoning.effort", "medium")
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request.go-		} else {
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request.go:			out, _ = sjson.Set(out, "reasoning.effort", effort)
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request.go-		}
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request.go-	} else {
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request.go:		out, _ = sjson.Set(out, "reasoning.effort", "medium")
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request.go-	}
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request.go-	out, _ = sjson.Set(out, "parallel_tool_calls", true)
--
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go-	out, _ = sjson.Set(out, "parallel_tool_calls", true)
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go-
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go:	// Convert Gemini thinkingConfig to Codex reasoning.effort.
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go-	// Note: Google official Python SDK sends snake_case fields (thinking_level/thinking_budget).
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go-	effortSet := false
--
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go-				effort := strings.ToLower(strings.TrimSpace(thinkingLevel.String()))
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go-				if effort != "" {
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go:					out, _ = sjson.Set(out, "reasoning.effort", effort)
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go-					effortSet = true
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go-				}
--
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go-				if thinkingBudget.Exists() {
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go-					if effort, ok := thinking.ConvertBudgetToLevel(int(thinkingBudget.Int())); ok {
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go:						out, _ = sjson.Set(out, "reasoning.effort", effort)
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go-						effortSet = true
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go-					}
--
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go-	if !effortSet {
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go-		// No thinking config, set default effort
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go:		out, _ = sjson.Set(out, "reasoning.effort", "medium")
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go-	}
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go-	out, _ = sjson.Set(out, "reasoning.summary", "auto")
--
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request.go-	rawJSON, _ = sjson.SetBytes(rawJSON, "stream", true)
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request.go-	rawJSON, _ = sjson.SetBytes(rawJSON, "store", false)
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request.go:	// Map variant -> reasoning.effort when reasoning.effort is not explicitly provided.
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request.go:	if !gjson.GetBytes(rawJSON, "reasoning.effort").Exists() {
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request.go-		if variant := gjson.GetBytes(rawJSON, "variant"); variant.Exists() {
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request.go-			effort := strings.ToLower(strings.TrimSpace(variant.String()))
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request.go-			if effort != "" {
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request.go:				rawJSON, _ = sjson.SetBytes(rawJSON, "reasoning.effort", effort)
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request.go-			}
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request.go-		}
--
pkg/llmproxy/thinking/apply_iflow_test.go-	}{
pkg/llmproxy/thinking/apply_iflow_test.go-		{
pkg/llmproxy/thinking/apply_iflow_test.go:			name: "nested reasoning.effort maps to level",
pkg/llmproxy/thinking/apply_iflow_test.go-			body: `{"reasoning":{"effort":"high"}}`,
pkg/llmproxy/thinking/apply_iflow_test.go-			want: ThinkingConfig{Mode: ModeLevel, Level: LevelHigh},
pkg/llmproxy/thinking/apply_iflow_test.go-		},
pkg/llmproxy/thinking/apply_iflow_test.go-		{
pkg/llmproxy/thinking/apply_iflow_test.go:			name: "literal reasoning.effort key maps to level",
pkg/llmproxy/thinking/apply_iflow_test.go:			body: `{"reasoning.effort":"high"}`,
pkg/llmproxy/thinking/apply_iflow_test.go-			want: ThinkingConfig{Mode: ModeLevel, Level: LevelHigh},
pkg/llmproxy/thinking/apply_iflow_test.go-		},
pkg/llmproxy/thinking/apply_iflow_test.go-		{
pkg/llmproxy/thinking/apply_iflow_test.go-			name: "none maps to disabled",
pkg/llmproxy/thinking/apply_iflow_test.go:			body: `{"reasoning.effort":"none"}`,
pkg/llmproxy/thinking/apply_iflow_test.go-			want: ThinkingConfig{Mode: ModeNone, Budget: 0},
pkg/llmproxy/thinking/apply_iflow_test.go-		},
--
pkg/llmproxy/thinking/strip.go-		paths = []string{"reasoning_effort"}
pkg/llmproxy/thinking/strip.go-	case "codex":
pkg/llmproxy/thinking/strip.go:		paths = []string{"reasoning.effort"}
pkg/llmproxy/thinking/strip.go-	case "iflow":
pkg/llmproxy/thinking/strip.go-		paths = []string{
--
pkg/llmproxy/thinking/strip_test.go-		t.Fatalf("expected reasoning to be removed")
pkg/llmproxy/thinking/strip_test.go-	}
pkg/llmproxy/thinking/strip_test.go:	if res.Get("reasoning.effort").Exists() {
pkg/llmproxy/thinking/strip_test.go:		t.Fatalf("expected reasoning.effort to be removed")
pkg/llmproxy/thinking/strip_test.go-	}
pkg/llmproxy/thinking/strip_test.go-	if res.Get("reasoning_split").Exists() {
--
pkg/llmproxy/thinking/provider/codex/apply.go-// Package codex implements thinking configuration for Codex (OpenAI Responses API) models.
pkg/llmproxy/thinking/provider/codex/apply.go-//
pkg/llmproxy/thinking/provider/codex/apply.go:// Codex models use the reasoning.effort format with discrete levels
pkg/llmproxy/thinking/provider/codex/apply.go-// (low/medium/high). This is similar to OpenAI but uses nested field
pkg/llmproxy/thinking/provider/codex/apply.go:// "reasoning.effort" instead of "reasoning_effort".
pkg/llmproxy/thinking/provider/codex/apply.go-// See: _bmad-output/planning-artifacts/architecture.md#Epic-8
pkg/llmproxy/thinking/provider/codex/apply.go-package codex
--
pkg/llmproxy/thinking/provider/codex/apply.go-//
pkg/llmproxy/thinking/provider/codex/apply.go-// Codex-specific behavior:
pkg/llmproxy/thinking/provider/codex/apply.go://   - Output format: reasoning.effort (string: low/medium/high/xhigh)
pkg/llmproxy/thinking/provider/codex/apply.go-//   - Level-only mode: no numeric budget support
pkg/llmproxy/thinking/provider/codex/apply.go-//   - Some models support ZeroAllowed (gpt-5.1, gpt-5.2)
--
pkg/llmproxy/thinking/provider/codex/apply.go-
pkg/llmproxy/thinking/provider/codex/apply.go-	if config.Mode == thinking.ModeLevel {
pkg/llmproxy/thinking/provider/codex/apply.go:		result, _ := sjson.SetBytes(body, "reasoning.effort", string(config.Level))
pkg/llmproxy/thinking/provider/codex/apply.go-		return result, nil
pkg/llmproxy/thinking/provider/codex/apply.go-	}
--
pkg/llmproxy/thinking/provider/codex/apply.go-	}
pkg/llmproxy/thinking/provider/codex/apply.go-
pkg/llmproxy/thinking/provider/codex/apply.go:	result, _ := sjson.SetBytes(body, "reasoning.effort", effort)
pkg/llmproxy/thinking/provider/codex/apply.go-	return result, nil
pkg/llmproxy/thinking/provider/codex/apply.go-}
--
pkg/llmproxy/thinking/provider/codex/apply.go-	}
pkg/llmproxy/thinking/provider/codex/apply.go-
pkg/llmproxy/thinking/provider/codex/apply.go:	result, _ := sjson.SetBytes(body, "reasoning.effort", effort)
pkg/llmproxy/thinking/provider/codex/apply.go-	return result, nil
pkg/llmproxy/thinking/provider/codex/apply.go-}
--
pkg/llmproxy/thinking/provider/iflow/apply_test.go-		t.Fatalf("expected reasoning object to be removed")
pkg/llmproxy/thinking/provider/iflow/apply_test.go-	}
pkg/llmproxy/thinking/provider/iflow/apply_test.go:	if res.Get("reasoning.effort").Exists() {
pkg/llmproxy/thinking/provider/iflow/apply_test.go:		t.Fatalf("expected reasoning.effort to be removed")
pkg/llmproxy/thinking/provider/iflow/apply_test.go-	}
pkg/llmproxy/thinking/provider/iflow/apply_test.go-	if res.Get("variant").Exists() {
--
pkg/llmproxy/translator/claude/openai/responses/claude_openai-responses_request.go-	root := gjson.ParseBytes(rawJSON)
pkg/llmproxy/translator/claude/openai/responses/claude_openai-responses_request.go-
pkg/llmproxy/translator/claude/openai/responses/claude_openai-responses_request.go:	// Convert OpenAI Responses reasoning.effort to Claude thinking config.
pkg/llmproxy/translator/claude/openai/responses/claude_openai-responses_request.go:	if v := root.Get("reasoning.effort"); v.Exists() {
pkg/llmproxy/translator/claude/openai/responses/claude_openai-responses_request.go-		effort := strings.ToLower(strings.TrimSpace(v.String()))
pkg/llmproxy/translator/claude/openai/responses/claude_openai-responses_request.go-		if effort != "" {
--
pkg/llmproxy/thinking/apply.go-//
pkg/llmproxy/thinking/apply.go-// Codex API format (OpenAI Responses API):
pkg/llmproxy/thinking/apply.go://   - reasoning.effort: "none", "low", "medium", "high"
pkg/llmproxy/thinking/apply.go-//
pkg/llmproxy/thinking/apply.go:// This is similar to OpenAI but uses nested field "reasoning.effort" instead of "reasoning_effort".
pkg/llmproxy/thinking/apply.go-func extractCodexConfig(body []byte) ThinkingConfig {
pkg/llmproxy/thinking/apply.go:	// Check reasoning.effort (Codex / OpenAI Responses API format)
pkg/llmproxy/thinking/apply.go:	if effort := gjson.GetBytes(body, "reasoning.effort"); effort.Exists() {
pkg/llmproxy/thinking/apply.go-		value := effort.String()
pkg/llmproxy/thinking/apply.go-		if value == "none" {
--
pkg/llmproxy/thinking/apply.go-
pkg/llmproxy/thinking/apply.go-	// Compatibility fallback: some clients send Claude-style `variant`
pkg/llmproxy/thinking/apply.go:	// instead of OpenAI/Codex `reasoning.effort`.
pkg/llmproxy/thinking/apply.go-	if variant := gjson.GetBytes(body, "variant"); variant.Exists() {
pkg/llmproxy/thinking/apply.go-		switch strings.ToLower(strings.TrimSpace(variant.String())) {
--
pkg/llmproxy/thinking/apply.go-	}
pkg/llmproxy/thinking/apply.go-
pkg/llmproxy/thinking/apply.go:	if effort := gjson.GetBytes(body, "reasoning.effort"); effort.Exists() {
pkg/llmproxy/thinking/apply.go-		if config, ok := parseIFlowReasoningEffort(strings.TrimSpace(effort.String())); ok {
pkg/llmproxy/thinking/apply.go-			return config
--
pkg/llmproxy/translator/codex/claude/codex_claude_request.go-	template, _ = sjson.Set(template, "parallel_tool_calls", true)
pkg/llmproxy/translator/codex/claude/codex_claude_request.go-
pkg/llmproxy/translator/codex/claude/codex_claude_request.go:	// Convert thinking.budget_tokens to reasoning.effort.
pkg/llmproxy/translator/codex/claude/codex_claude_request.go-	reasoningEffort := "medium"
pkg/llmproxy/translator/codex/claude/codex_claude_request.go-	if thinkingConfig := rootResult.Get("thinking"); thinkingConfig.Exists() && thinkingConfig.IsObject() {
--
pkg/llmproxy/translator/codex/claude/codex_claude_request.go-		}
pkg/llmproxy/translator/codex/claude/codex_claude_request.go-	}
pkg/llmproxy/translator/codex/claude/codex_claude_request.go:	template, _ = sjson.Set(template, "reasoning.effort", reasoningEffort)
pkg/llmproxy/translator/codex/claude/codex_claude_request.go-	template, _ = sjson.Set(template, "reasoning.summary", "auto")
pkg/llmproxy/translator/codex/claude/codex_claude_request.go-	template, _ = sjson.Set(template, "stream", true)
--
pkg/llmproxy/executor/codex_executor_cpb0106_test.go-		t.Fatalf("expected upstream stream=true, got %v", out.Bool())
pkg/llmproxy/executor/codex_executor_cpb0106_test.go-	}
pkg/llmproxy/executor/codex_executor_cpb0106_test.go:	if got := gjson.GetBytes(upstreamBody, "reasoning.effort").String(); got != "high" {
pkg/llmproxy/executor/codex_executor_cpb0106_test.go:		t.Fatalf("expected reasoning.effort=high, got %q", got)
pkg/llmproxy/executor/codex_executor_cpb0106_test.go-	}
pkg/llmproxy/executor/codex_executor_cpb0106_test.go-}
--
pkg/llmproxy/executor/codex_executor_cpb0106_test.go-		t.Fatalf("expected upstream stream=false in ExecuteStream path, got %v", got)
pkg/llmproxy/executor/codex_executor_cpb0106_test.go-	}
pkg/llmproxy/executor/codex_executor_cpb0106_test.go:	if got := gjson.GetBytes(upstreamBody, "reasoning.effort").String(); got != "high" {
pkg/llmproxy/executor/codex_executor_cpb0106_test.go:		t.Fatalf("expected reasoning.effort=high, got %q", got)
pkg/llmproxy/executor/codex_executor_cpb0106_test.go-	}
pkg/llmproxy/executor/codex_executor_cpb0106_test.go-}
--
pkg/llmproxy/translator/openai/openai/responses/openai_openai-responses_request_test.go-		"model":"gpt-4.1",
pkg/llmproxy/translator/openai/openai/responses/openai_openai-responses_request_test.go-		"input":[{"type":"message","role":"user","content":[{"type":"input_text","text":"ping"}]}],
pkg/llmproxy/translator/openai/openai/responses/openai_openai-responses_request_test.go:		"reasoning.effort":"low"
pkg/llmproxy/translator/openai/openai/responses/openai_openai-responses_request_test.go-	}`)
pkg/llmproxy/translator/openai/openai/responses/openai_openai-responses_request_test.go-

Script executed:

cat -n pkg/llmproxy/thinking/apply_iflow_test.go | head -50

Repository: KooshaPari/cliproxyapi-plusplus


Exit code: 0

stdout:

     1	package thinking
     2	
     3	import "testing"
     4	
     5	func TestExtractIFlowConfig_ReasoningEffort(t *testing.T) {
     6		tests := []struct {
     7			name string
     8			body string
     9			want ThinkingConfig
    10		}{
    11			{
    12				name: "nested reasoning.effort maps to level",
    13				body: `{"reasoning":{"effort":"high"}}`,
    14				want: ThinkingConfig{Mode: ModeLevel, Level: LevelHigh},
    15			},
    16			{
    17				name: "literal reasoning.effort key maps to level",
    18				body: `{"reasoning.effort":"high"}`,
    19				want: ThinkingConfig{Mode: ModeLevel, Level: LevelHigh},
    20			},
    21			{
    22				name: "none maps to disabled",
    23				body: `{"reasoning.effort":"none"}`,
    24				want: ThinkingConfig{Mode: ModeNone, Budget: 0},
    25			},
    26		}
    27	
    28		for _, tt := range tests {
    29			t.Run(tt.name, func(t *testing.T) {
    30				got := extractIFlowConfig([]byte(tt.body))
    31				if got != tt.want {
    32					t.Fatalf("got=%+v want=%+v", got, tt.want)
    33				}
    34			})
    35		}
    36	}
    37	
    38	func TestExtractIFlowConfig_ReasoningObjectEffort(t *testing.T) {
    39		tests := []struct {
    40			name string
    41			body string
    42			want ThinkingConfig
    43		}{
    44			{
    45				name: "reasoning object effort maps to level",
    46				body: `{"reasoning":{"effort":"medium"}}`,
    47				want: ThinkingConfig{Mode: ModeLevel, Level: LevelMedium},
    48			},
    49			{
    50				name: "empty effort falls back to empty config",
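
The table just printed drives `extractIFlowConfig`, which delegates effort strings to `parseIFlowReasoningEffort`. Inferring only from those test expectations — the type and constant names below are hypothetical stand-ins for the ones in package `thinking` — the mapping can be sketched as:

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical stand-ins for the types in pkg/llmproxy/thinking.
type Mode string
type Level string

const (
	ModeNone    Mode  = "none"
	ModeLevel   Mode  = "level"
	LevelLow    Level = "low"
	LevelMedium Level = "medium"
	LevelHigh   Level = "high"
)

type ThinkingConfig struct {
	Mode   Mode
	Level  Level
	Budget int
}

// parseEffort sketches parseIFlowReasoningEffort's observable behavior:
// "none" disables thinking; low/medium/high select level mode; anything
// else reports !ok so the caller can fall through to other fields.
func parseEffort(effort string) (ThinkingConfig, bool) {
	switch strings.ToLower(strings.TrimSpace(effort)) {
	case "none":
		return ThinkingConfig{Mode: ModeNone, Budget: 0}, true
	case "low":
		return ThinkingConfig{Mode: ModeLevel, Level: LevelLow}, true
	case "medium":
		return ThinkingConfig{Mode: ModeLevel, Level: LevelMedium}, true
	case "high":
		return ThinkingConfig{Mode: ModeLevel, Level: LevelHigh}, true
	}
	return ThinkingConfig{}, false
}

func main() {
	cfg, ok := parseEffort("high")
	fmt.Println(ok, cfg.Mode, cfg.Level)
}
```

Under these assumptions both the nested `{"reasoning":{"effort":"high"}}` and the literal `{"reasoning.effort":"high"}` bodies reduce to the same `high` string before this switch runs.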

Script executed:

rg "reasoning\.effort" pkg/llmproxy/thinking/ --type go -B 3 -A 3

Repository: KooshaPari/cliproxyapi-plusplus


Exit code: 0

stdout:

pkg/llmproxy/thinking/apply.go-// extractCodexConfig extracts thinking configuration from Codex format request body.
pkg/llmproxy/thinking/apply.go-//
pkg/llmproxy/thinking/apply.go-// Codex API format (OpenAI Responses API):
pkg/llmproxy/thinking/apply.go://   - reasoning.effort: "none", "low", "medium", "high"
pkg/llmproxy/thinking/apply.go-//
pkg/llmproxy/thinking/apply.go:// This is similar to OpenAI but uses nested field "reasoning.effort" instead of "reasoning_effort".
pkg/llmproxy/thinking/apply.go-func extractCodexConfig(body []byte) ThinkingConfig {
pkg/llmproxy/thinking/apply.go:	// Check reasoning.effort (Codex / OpenAI Responses API format)
pkg/llmproxy/thinking/apply.go:	if effort := gjson.GetBytes(body, "reasoning.effort"); effort.Exists() {
pkg/llmproxy/thinking/apply.go-		value := effort.String()
pkg/llmproxy/thinking/apply.go-		if value == "none" {
pkg/llmproxy/thinking/apply.go-			return ThinkingConfig{Mode: ModeNone, Budget: 0}
--
pkg/llmproxy/thinking/apply.go-	}
pkg/llmproxy/thinking/apply.go-
pkg/llmproxy/thinking/apply.go-	// Compatibility fallback: some clients send Claude-style `variant`
pkg/llmproxy/thinking/apply.go:	// instead of OpenAI/Codex `reasoning.effort`.
pkg/llmproxy/thinking/apply.go-	if variant := gjson.GetBytes(body, "variant"); variant.Exists() {
pkg/llmproxy/thinking/apply.go-		switch strings.ToLower(strings.TrimSpace(variant.String())) {
pkg/llmproxy/thinking/apply.go-		case "none":
--
pkg/llmproxy/thinking/apply.go-		return ThinkingConfig{Mode: ModeNone, Budget: 0}
pkg/llmproxy/thinking/apply.go-	}
pkg/llmproxy/thinking/apply.go-
pkg/llmproxy/thinking/apply.go:	if effort := gjson.GetBytes(body, "reasoning.effort"); effort.Exists() {
pkg/llmproxy/thinking/apply.go-		if config, ok := parseIFlowReasoningEffort(strings.TrimSpace(effort.String())); ok {
pkg/llmproxy/thinking/apply.go-			return config
pkg/llmproxy/thinking/apply.go-		}
--
pkg/llmproxy/thinking/strip_test.go-	if res.Get("reasoning").Exists() {
pkg/llmproxy/thinking/strip_test.go-		t.Fatalf("expected reasoning to be removed")
pkg/llmproxy/thinking/strip_test.go-	}
pkg/llmproxy/thinking/strip_test.go:	if res.Get("reasoning.effort").Exists() {
pkg/llmproxy/thinking/strip_test.go:		t.Fatalf("expected reasoning.effort to be removed")
pkg/llmproxy/thinking/strip_test.go-	}
pkg/llmproxy/thinking/strip_test.go-	if res.Get("reasoning_split").Exists() {
pkg/llmproxy/thinking/strip_test.go-		t.Fatalf("expected reasoning_split to be removed")
--
pkg/llmproxy/thinking/strip.go-	case "openai":
pkg/llmproxy/thinking/strip.go-		paths = []string{"reasoning_effort"}
pkg/llmproxy/thinking/strip.go-	case "codex":
pkg/llmproxy/thinking/strip.go:		paths = []string{"reasoning.effort"}
pkg/llmproxy/thinking/strip.go-	case "iflow":
pkg/llmproxy/thinking/strip.go-		paths = []string{
pkg/llmproxy/thinking/strip.go-			"chat_template_kwargs.enable_thinking",
--
pkg/llmproxy/thinking/apply_iflow_test.go-		want ThinkingConfig
pkg/llmproxy/thinking/apply_iflow_test.go-	}{
pkg/llmproxy/thinking/apply_iflow_test.go-		{
pkg/llmproxy/thinking/apply_iflow_test.go:			name: "nested reasoning.effort maps to level",
pkg/llmproxy/thinking/apply_iflow_test.go-			body: `{"reasoning":{"effort":"high"}}`,
pkg/llmproxy/thinking/apply_iflow_test.go-			want: ThinkingConfig{Mode: ModeLevel, Level: LevelHigh},
pkg/llmproxy/thinking/apply_iflow_test.go-		},
pkg/llmproxy/thinking/apply_iflow_test.go-		{
pkg/llmproxy/thinking/apply_iflow_test.go:			name: "literal reasoning.effort key maps to level",
pkg/llmproxy/thinking/apply_iflow_test.go:			body: `{"reasoning.effort":"high"}`,
pkg/llmproxy/thinking/apply_iflow_test.go-			want: ThinkingConfig{Mode: ModeLevel, Level: LevelHigh},
pkg/llmproxy/thinking/apply_iflow_test.go-		},
pkg/llmproxy/thinking/apply_iflow_test.go-		{
pkg/llmproxy/thinking/apply_iflow_test.go-			name: "none maps to disabled",
pkg/llmproxy/thinking/apply_iflow_test.go:			body: `{"reasoning.effort":"none"}`,
pkg/llmproxy/thinking/apply_iflow_test.go-			want: ThinkingConfig{Mode: ModeNone, Budget: 0},
pkg/llmproxy/thinking/apply_iflow_test.go-		},
pkg/llmproxy/thinking/apply_iflow_test.go-	}
--
pkg/llmproxy/thinking/provider/codex/apply.go-// Package codex implements thinking configuration for Codex (OpenAI Responses API) models.
pkg/llmproxy/thinking/provider/codex/apply.go-//
pkg/llmproxy/thinking/provider/codex/apply.go:// Codex models use the reasoning.effort format with discrete levels
pkg/llmproxy/thinking/provider/codex/apply.go-// (low/medium/high). This is similar to OpenAI but uses nested field
pkg/llmproxy/thinking/provider/codex/apply.go:// "reasoning.effort" instead of "reasoning_effort".
pkg/llmproxy/thinking/provider/codex/apply.go-// See: _bmad-output/planning-artifacts/architecture.md#Epic-8
pkg/llmproxy/thinking/provider/codex/apply.go-package codex
pkg/llmproxy/thinking/provider/codex/apply.go-
--
pkg/llmproxy/thinking/provider/codex/apply.go-// Applier implements thinking.ProviderApplier for Codex models.
pkg/llmproxy/thinking/provider/codex/apply.go-//
pkg/llmproxy/thinking/provider/codex/apply.go-// Codex-specific behavior:
pkg/llmproxy/thinking/provider/codex/apply.go://   - Output format: reasoning.effort (string: low/medium/high/xhigh)
pkg/llmproxy/thinking/provider/codex/apply.go-//   - Level-only mode: no numeric budget support
pkg/llmproxy/thinking/provider/codex/apply.go-//   - Some models support ZeroAllowed (gpt-5.1, gpt-5.2)
pkg/llmproxy/thinking/provider/codex/apply.go-type Applier struct{}
--
pkg/llmproxy/thinking/provider/codex/apply.go-	}
pkg/llmproxy/thinking/provider/codex/apply.go-
pkg/llmproxy/thinking/provider/codex/apply.go-	if config.Mode == thinking.ModeLevel {
pkg/llmproxy/thinking/provider/codex/apply.go:		result, _ := sjson.SetBytes(body, "reasoning.effort", string(config.Level))
pkg/llmproxy/thinking/provider/codex/apply.go-		return result, nil
pkg/llmproxy/thinking/provider/codex/apply.go-	}
pkg/llmproxy/thinking/provider/codex/apply.go-
--
pkg/llmproxy/thinking/provider/codex/apply.go-		return body, nil
pkg/llmproxy/thinking/provider/codex/apply.go-	}
pkg/llmproxy/thinking/provider/codex/apply.go-
pkg/llmproxy/thinking/provider/codex/apply.go:	result, _ := sjson.SetBytes(body, "reasoning.effort", effort)
pkg/llmproxy/thinking/provider/codex/apply.go-	return result, nil
pkg/llmproxy/thinking/provider/codex/apply.go-}
pkg/llmproxy/thinking/provider/codex/apply.go-
--
pkg/llmproxy/thinking/provider/codex/apply.go-		return body, nil
pkg/llmproxy/thinking/provider/codex/apply.go-	}
pkg/llmproxy/thinking/provider/codex/apply.go-
pkg/llmproxy/thinking/provider/codex/apply.go:	result, _ := sjson.SetBytes(body, "reasoning.effort", effort)
pkg/llmproxy/thinking/provider/codex/apply.go-	return result, nil
pkg/llmproxy/thinking/provider/codex/apply.go-}
pkg/llmproxy/thinking/provider/codex/apply.go-
--
pkg/llmproxy/thinking/provider/iflow/apply_test.go-	if res.Get("reasoning").Exists() {
pkg/llmproxy/thinking/provider/iflow/apply_test.go-		t.Fatalf("expected reasoning object to be removed")
pkg/llmproxy/thinking/provider/iflow/apply_test.go-	}
pkg/llmproxy/thinking/provider/iflow/apply_test.go:	if res.Get("reasoning.effort").Exists() {
pkg/llmproxy/thinking/provider/iflow/apply_test.go:		t.Fatalf("expected reasoning.effort to be removed")
pkg/llmproxy/thinking/provider/iflow/apply_test.go-	}
pkg/llmproxy/thinking/provider/iflow/apply_test.go-	if res.Get("variant").Exists() {
pkg/llmproxy/thinking/provider/iflow/apply_test.go-		t.Fatalf("expected variant to be removed")

Script executed:

cat > /tmp/test_sjson.go << 'EOF'
package main

import (
	"fmt"
	"github.com/tidwall/gjson"
	"github.com/tidwall/sjson"
)

func main() {
	// Test 1: nested path deletion
	input1 := []byte(`{"reasoning":{"effort":"high"},"foo":"bar"}`)
	result1, _ := sjson.DeleteBytes(input1, "reasoning.effort")
	fmt.Println("Nested delete result:", string(result1))
	fmt.Println("After nested delete, reasoning exists:", gjson.GetBytes(result1, "reasoning").Exists())
	fmt.Println("After nested delete, reasoning.effort exists:", gjson.GetBytes(result1, "reasoning.effort").Exists())

	// Test 2: literal dot-key without escaping
	input2 := []byte(`{"reasoning.effort":"high","foo":"bar"}`)
	result2, _ := sjson.DeleteBytes(input2, "reasoning.effort")
	fmt.Println("\nLiteral key (no escape) delete result:", string(result2))
	fmt.Println("After literal delete, reasoning.effort exists:", gjson.GetBytes(result2, `reasoning\.effort`).Exists())

	// Test 3: literal dot-key WITH escaping
	input3 := []byte(`{"reasoning.effort":"high","foo":"bar"}`)
	result3, _ := sjson.DeleteBytes(input3, `reasoning\.effort`)
	fmt.Println("\nLiteral key (with escape) delete result:", string(result3))
	fmt.Println("After escaped delete, reasoning.effort exists:", gjson.GetBytes(result3, `reasoning\.effort`).Exists())
}
EOF
cd /tmp && go mod init test 2>/dev/null; go get github.com/tidwall/sjson 2>/dev/null; go run test_sjson.go

Repository: KooshaPari/cliproxyapi-plusplus


Repository: KooshaPari/cliproxyapi-plusplus
Exit code: 1

stderr:

go: warning: ignoring go.mod in system temp root /tmp
test_sjson.go:5:2: no required module provides package github.com/tidwall/gjson: go.mod file not found in current directory or any parent directory; see 'go help modules'
test_sjson.go:6:2: no required module provides package github.com/tidwall/sjson: go.mod file not found in current directory or any parent directory; see 'go help modules'

Script executed:

cat -n pkg/llmproxy/thinking/strip_test.go

Repository: KooshaPari/cliproxyapi-plusplus


Repository: KooshaPari/cliproxyapi-plusplus
Exit code: 0

stdout:

     1	package thinking
     2	
     3	import (
     4		"testing"
     5	
     6		"github.com/tidwall/gjson"
     7	)
     8	
     9	func TestStripThinkingConfigIflow(t *testing.T) {
    10		body := []byte(`{
    11			"model":"minimax-m2.5",
    12			"reasoning":{"effort":"high"},
    13			"reasoning_effort":"high",
    14			"reasoning_split":true,
    15			"variant":"medium",
    16			"chat_template_kwargs":{"enable_thinking":true,"clear_thinking":false},
    17			"messages":[{"role":"user","content":"hi"}]
    18		}`)
    19	
    20		out := StripThinkingConfig(body, "iflow")
    21		res := gjson.ParseBytes(out)
    22	
    23		if res.Get("reasoning").Exists() {
    24			t.Fatalf("expected reasoning to be removed")
    25		}
    26		if res.Get("reasoning.effort").Exists() {
    27			t.Fatalf("expected reasoning.effort to be removed")
    28		}
    29		if res.Get("reasoning_split").Exists() {
    30			t.Fatalf("expected reasoning_split to be removed")
    31		}
    32		if res.Get("reasoning_effort").Exists() {
    33			t.Fatalf("expected reasoning_effort to be removed")
    34		}
    35		if res.Get("variant").Exists() {
    36			t.Fatalf("expected variant to be removed")
    37		}
    38		if res.Get("chat_template_kwargs").Exists() {
    39			t.Fatalf("expected chat_template_kwargs to be removed")
    40		}
    41		if res.Get("messages.0.content").String() != "hi" {
    42			t.Fatalf("expected unrelated messages to remain")
    43		}
    44	}
    45	
    46	func TestStripThinkingConfigIflowPreservesUnrelatedChatTemplateKwargs(t *testing.T) {
    47		body := []byte(`{
    48			"chat_template_kwargs":{
    49				"enable_thinking":true,
    50				"clear_thinking":false,
    51				"system_prompt":"keep-me"
    52			},
    53			"messages":[{"role":"user","content":"hi"}]
    54		}`)
    55	
    56		out := StripThinkingConfig(body, "iflow")
    57		res := gjson.ParseBytes(out)
    58	
    59		if res.Get("chat_template_kwargs.enable_thinking").Exists() {
    60			t.Fatalf("expected enable_thinking to be removed")
    61		}
    62		if res.Get("chat_template_kwargs.clear_thinking").Exists() {
    63			t.Fatalf("expected clear_thinking to be removed")
    64		}
    65		if res.Get("chat_template_kwargs.system_prompt").String() != "keep-me" {
    66			t.Fatalf("expected unrelated chat_template_kwargs fields to remain")
    67		}
    68		if res.Get("messages.0.content").String() != "hi" {
    69			t.Fatalf("expected unrelated messages to remain")
    70		}
    71	}

Script executed:

rg "reasoning\.effort" pkg/llmproxy/thinking/strip_test.go

Repository: KooshaPari/cliproxyapi-plusplus


Repository: KooshaPari/cliproxyapi-plusplus
Exit code: 0

stdout:

	if res.Get("reasoning.effort").Exists() {
		t.Fatalf("expected reasoning.effort to be removed")

Script executed:

python3 << 'PYEOF'
# Let's verify the sjson behavior with literal dot-keys
# by checking how iflow extraction tests work
import json

# From apply_iflow_test.go line 17-18:
test_body = '{"reasoning.effort":"high"}'
data = json.loads(test_body)
print("Parsed JSON with literal dot-key:")
print(json.dumps(data, indent=2))
print("\nKey in dict:", "reasoning.effort" in data)
print("Key names:", list(data.keys()))
PYEOF

Repository: KooshaPari/cliproxyapi-plusplus


Repository: KooshaPari/cliproxyapi-plusplus
Exit code: 0

stdout:

Parsed JSON with literal dot-key:
{
  "reasoning.effort": "high"
}

Key in dict: True
Key names: ['reasoning.effort']

Script executed:

rg -A 5 "TestStripThinkingConfigIflow\(" pkg/llmproxy/thinking/strip_test.go | head -20

Repository: KooshaPari/cliproxyapi-plusplus


Repository: KooshaPari/cliproxyapi-plusplus
Exit code: 0

stdout:

func TestStripThinkingConfigIflow(t *testing.T) {
	body := []byte(`{
		"model":"minimax-m2.5",
		"reasoning":{"effort":"high"},
		"reasoning_effort":"high",
		"reasoning_split":true,

Script executed:

# Check if any test input to strip has a literal reasoning.effort key
rg '"reasoning\.effort"' pkg/llmproxy/thinking/strip_test.go

Repository: KooshaPari/cliproxyapi-plusplus


Repository: KooshaPari/cliproxyapi-plusplus
Exit code: 0

stdout:

	if res.Get("reasoning.effort").Exists() {

Script executed:

sed -n '40,65p' pkg/llmproxy/thinking/strip.go

Repository: KooshaPari/cliproxyapi-plusplus


Repository: KooshaPari/cliproxyapi-plusplus
Exit code: 0

stdout:

	case "codex":
		paths = []string{"reasoning.effort"}
	case "iflow":
		paths = []string{
			"chat_template_kwargs.enable_thinking",
			"chat_template_kwargs.clear_thinking",
			"reasoning",
			"reasoning_split",
			"reasoning_effort",
			"variant",
		}
	default:
		return body
	}

	result := body
	for _, path := range paths {
		result, _ = sjson.DeleteBytes(result, path)
	}
	if provider == "iflow" {
		if kwargs := gjson.GetBytes(result, "chat_template_kwargs"); kwargs.Exists() && len(kwargs.Map()) == 0 {
			result, _ = sjson.DeleteBytes(result, "chat_template_kwargs")
		}
	}
	return result
}

Add escaped literal key path "reasoning\.effort" to iflow stripping.

The iflow case removes "reasoning" (nested) but not a literal top-level "reasoning.effort" key. Since iflow inputs can contain this literal dot-key (as shown in apply_iflow_test.go), it must be stripped to prevent unexpected payload shapes. Add the escaped gjson path to the paths list; in Go source it must be a raw string literal (backquoted), because `\.` is not a valid escape sequence in an interpreted string literal and will not compile.

Proposed fix
 	case "iflow":
 		paths = []string{
 			"chat_template_kwargs.enable_thinking",
 			"chat_template_kwargs.clear_thinking",
 			"reasoning",
 			"reasoning_split",
 			"reasoning_effort",
+			`reasoning\.effort`,
 			"variant",
 		}

Consider centralizing iflow cleanup paths shared with pkg/llmproxy/thinking/provider/iflow/apply.go to prevent future drift.
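The two key shapes the stripper must handle — a nested `reasoning.effort` object path and a literal top-level `"reasoning.effort"` key — can be sketched with stdlib-only Go (hypothetical helper name; the real stripper uses sjson.DeleteBytes with an escaped path):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// stripReasoningEffort removes both the nested reasoning.effort field and a
// literal top-level "reasoning.effort" key. It mirrors, in stdlib terms, what
// the sjson-based stripper must do with an escaped path.
func stripReasoningEffort(body []byte) ([]byte, error) {
	var m map[string]any
	if err := json.Unmarshal(body, &m); err != nil {
		return nil, err
	}
	delete(m, "reasoning.effort") // literal dot-key: a single map entry
	if nested, ok := m["reasoning"].(map[string]any); ok {
		delete(nested, "effort") // nested object path
		if len(nested) == 0 {
			delete(m, "reasoning")
		}
	}
	return json.Marshal(m)
}

func main() {
	out1, _ := stripReasoningEffort([]byte(`{"reasoning.effort":"high","foo":"bar"}`))
	fmt.Println(string(out1)) // {"foo":"bar"}
	out2, _ := stripReasoningEffort([]byte(`{"reasoning":{"effort":"high"},"foo":"bar"}`))
	fmt.Println(string(out2)) // {"foo":"bar"}
}
```

With gjson/sjson the same distinction lives in the path: `reasoning.effort` addresses the nested field, while the backslash-escaped form (kept intact via a raw string literal) addresses the literal key.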

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/llmproxy/thinking/strip.go` around lines 43 - 50, The iflow stripping
list in pkg/llmproxy/thinking/strip.go currently removes nested "reasoning" but
not a literal top-level key "reasoning.effort"; update the paths slice (the
variable assigned to paths in strip.go) to include the escaped literal key
"reasoning\.effort" so top-level dot-keys are stripped as well, and consider
moving this paths list into a shared constant used by the iflow cleanup code in
provider/iflow/apply.go to avoid drift.

Comment on lines +45 to +50
registry.GetGlobalRegistry().RegisterClient("endpoint-compat-iflow-direction", "iflow", []*registry.ModelInfo{{
ID: "minimax-m2.5-directionality-test",
SupportedEndpoints: []string{"/chat/completions"},
}})
t.Cleanup(func() {
registry.GetGlobalRegistry().UnregisterClient("endpoint-compat-iflow-directionality")

⚠️ Potential issue | 🟠 Major

Use the same client ID in cleanup.

Line 45 registers "endpoint-compat-iflow-direction", but Line 50 unregisters "endpoint-compat-iflow-directionality". That leaves the model in the global registry and makes later tests order-dependent.

🧹 Fix the leaked registry entry
 	t.Cleanup(func() {
-		registry.GetGlobalRegistry().UnregisterClient("endpoint-compat-iflow-directionality")
+		registry.GetGlobalRegistry().UnregisterClient("endpoint-compat-iflow-direction")
 	})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/api/handlers/openai/endpoint_compat_test.go` around lines 45 - 50, The
cleanup is unregistering a different client ID than the one registered, leaking
the registry entry; update the UnregisterClient call so it uses the exact same
client ID string used in RegisterClient
("endpoint-compat-iflow-direction")—i.e., locate the register call via
registry.GetGlobalRegistry().RegisterClient(...) and make the corresponding
registry.GetGlobalRegistry().UnregisterClient(...) use that identical ID.

Comment thread sdk/cliproxy/auth/manager_compat.go Outdated
Comment on lines +186 to +193
m.mu.Lock()
defer m.mu.Unlock()
if replaced, ok := m.executors[key]; ok {
if closer, okCloser := replaced.(ExecutionSessionCloser); okCloser {
closer.CloseExecutionSession(CloseAllExecutionSessionsID)
}
}
m.executors[key] = executor

⚠️ Potential issue | 🟠 Major

Move CloseExecutionSession outside m.mu.

CloseExecutionSession is external code. Calling it while holding the manager write lock can deadlock if the executor re-enters the manager, and it blocks every auth/executor read until the close returns.

Possible fix
 	m.mu.Lock()
-	defer m.mu.Unlock()
-	if replaced, ok := m.executors[key]; ok {
-		if closer, okCloser := replaced.(ExecutionSessionCloser); okCloser {
-			closer.CloseExecutionSession(CloseAllExecutionSessionsID)
-		}
-	}
-	m.executors[key] = executor
+	replaced := m.executors[key]
+	m.executors[key] = executor
+	m.mu.Unlock()
+	if closer, ok := replaced.(ExecutionSessionCloser); ok {
+		closer.CloseExecutionSession(CloseAllExecutionSessionsID)
+	}
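The capture-then-close pattern this fix relies on can be sketched with stdlib-only Go (hypothetical minimal types standing in for the manager and ExecutionSessionCloser):

```go
package main

import (
	"fmt"
	"sync"
)

// closer is a stand-in for ExecutionSessionCloser.
type closer interface{ Close() }

type manager struct {
	mu        sync.Mutex
	executors map[string]closer
}

// swap replaces the executor for key, deferring any Close call until after
// the lock is released, so external code can't re-enter and deadlock.
func (m *manager) swap(key string, next closer) {
	m.mu.Lock()
	prev := m.executors[key]
	m.executors[key] = next
	m.mu.Unlock()
	if prev != nil {
		prev.Close() // runs outside m.mu
	}
}

type fake struct{ name string }

func (f *fake) Close() { fmt.Println("closed", f.name) }

func main() {
	m := &manager{executors: map[string]closer{}}
	m.swap("openai", &fake{name: "a"})
	m.swap("openai", &fake{name: "b"}) // prints "closed a"
}
```

The key property: the map mutation and the read of the previous value happen under the lock, but the arbitrary external call happens after unlock, so even a Close that calls back into swap cannot deadlock.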
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/cliproxy/auth/manager_compat.go` around lines 186 - 193, The manager
currently calls CloseExecutionSession while holding m.mu which can deadlock;
modify Set/replace logic so you capture the replaced executor (m.executors[key])
while holding m.mu, then release the lock and call
CloseExecutionSession(CloseAllExecutionSessionsID) on it outside the lock;
specifically, in the function that assigns m.executors[key] (the block that
checks for replaced, ok := m.executors[key] and type-asserts to
ExecutionSessionCloser), store the replaced value in a local variable while
locked, perform m.executors[key] = executor, unlock (defer removal of defer if
needed), and only after the mutex is released, if the local variable implements
ExecutionSessionCloser call
closer.CloseExecutionSession(CloseAllExecutionSessionsID).

Comment thread sdk/cliproxy/auth/manager_compat.go Outdated
Comment on lines +428 to +434
func (m *Manager) ExecuteStream(ctx context.Context, providers []string, req cliproxyexecutor.Request, opts cliproxyexecutor.Options) (*cliproxyexecutor.StreamResult, error) {
auth, executor, err := m.selectAuthAndExecutor(ctx, providers, req.Model, opts)
if err != nil {
return nil, err
}
req.Model = m.applyAPIKeyModelAlias(auth, req.Model)
return executor.ExecuteStream(ctx, auth, req, opts)

⚠️ Potential issue | 🟠 Major

Record immediate stream startup failures.

Unlike Execute and ExecuteCount, this path returns ExecuteStream errors without calling recordExecutionResult, so quota/backoff state is never updated when stream creation fails.

Possible fix
 	req.Model = m.applyAPIKeyModelAlias(auth, req.Model)
-	return executor.ExecuteStream(ctx, auth, req, opts)
+	stream, execErr := executor.ExecuteStream(ctx, auth, req, opts)
+	if execErr != nil {
+		m.recordExecutionResult(ctx, auth, req.Model, execErr, nil)
+		return nil, execErr
+	}
+	return stream, nil
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/cliproxy/auth/manager_compat.go` around lines 428 - 434, The
ExecuteStream path currently returns executor.ExecuteStream errors without
updating quota/backoff state; modify Manager.ExecuteStream so after selecting
auth/executor (selectAuthAndExecutor) and applying applyAPIKeyModelAlias, you
call executor.ExecuteStream and if it returns an error immediately call
m.recordExecutionResult(...) with the same ctx, auth, req.Model (or the aliased
model) and an appropriate failure result to update quota/backoff state before
returning the error; keep the existing success return path unchanged so only
startup failures are recorded.

Comment thread sdk/cliproxy/auth/manager_compat.go Outdated
Comment on lines +452 to +487
var candidates []*Auth
var executor ProviderExecutor
m.mu.RLock()
for _, provider := range providers {
exec := m.executors[strings.ToLower(strings.TrimSpace(provider))]
if exec == nil {
continue
}
for _, auth := range m.auths {
if auth == nil || !strings.EqualFold(auth.Provider, provider) {
continue
}
if pinnedAuthID != "" && auth.ID != pinnedAuthID {
continue
}
candidates = append(candidates, auth.Clone())
}
if executor == nil {
executor = exec
}
}
selector := m.selector
m.mu.RUnlock()
if len(candidates) == 0 || executor == nil {
return nil, nil, &Error{Code: "auth_not_found", Message: "no auth candidates", HTTPStatus: http.StatusServiceUnavailable, Retryable: true}
}
auth, err := selector.Pick(ctx, providers[0], model, opts, candidates)
if err != nil {
return nil, nil, err
}
updateAggregatedAvailability(auth, now)
if selectedAuthCallback != nil && auth != nil {
selectedAuthCallback(auth.ID)
}
return auth, executor, nil
}

⚠️ Potential issue | 🔴 Critical

Keep auth selection and executor selection in the same provider scope.

candidates mixes auths from every requested provider, but executor is pinned to the first provider with an executor and Line 478 always passes providers[0] to the selector. With providers = ["gemini", "claude"], the selector can return a claude auth while Execute* still calls the gemini executor.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/cliproxy/auth/manager_compat.go` around lines 452 - 487, The selection
mixes auths from all providers into candidates but picks an executor only for
the first provider with an executor and always calls selector.Pick with
providers[0], causing provider mismatch; instead, for each provider iterate
building provider-scoped candidates from m.auths (using
strings.EqualFold(auth.Provider, provider) and pinnedAuthID filtering), get the
corresponding exec from m.executors, and if exec != nil and that provider has
candidates call selector.Pick(ctx, provider, model, opts, providerCandidates);
if selector.Pick returns an auth, call updateAggregatedAvailability and
selectedAuthCallback(auth.ID) and immediately return that auth and the matching
exec; only if no provider yields a pick return the auth_not_found error—this
ensures executor and auth come from the same provider (referencing candidates,
executor, m.executors, providers, selector.Pick, m.auths, pinnedAuthID,
updateAggregatedAvailability, selectedAuthCallback).
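The provider-scoped pairing the fix asks for can be sketched with hypothetical simplified types (auths and executors reduced to plain strings):

```go
package main

import "fmt"

// pickScoped pairs an auth with an executor from the same provider, trying
// providers in order. It never mixes an auth from one provider with another
// provider's executor — the invariant the review calls out.
func pickScoped(providers []string, auths map[string][]string, executors map[string]string) (string, string, bool) {
	for _, p := range providers {
		exec, ok := executors[p]
		if !ok || len(auths[p]) == 0 {
			continue // this provider can't serve both halves; try the next one
		}
		return auths[p][0], exec, true
	}
	return "", "", false
}

func main() {
	// gemini has an executor but no auths; claude has both.
	auths := map[string][]string{"claude": {"claude-auth-1"}}
	executors := map[string]string{"gemini": "gemini-exec", "claude": "claude-exec"}
	a, e, ok := pickScoped([]string{"gemini", "claude"}, auths, executors)
	fmt.Println(a, e, ok) // prints: claude-auth-1 claude-exec true
}
```

Under the buggy shape, the same inputs could yield a claude auth paired with the gemini executor, since candidates were pooled across providers while the executor was pinned to the first provider that had one.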

Comment thread sdk/cliproxy/auth/manager_compat.go Outdated
Comment on lines +543 to +574
switch strings.ToLower(strings.TrimSpace(auth.Provider)) {
case "gemini":
for _, key := range cfg.GeminiKey {
if strings.TrimSpace(key.APIKey) == apiKey && (baseURL == "" || strings.TrimSpace(key.BaseURL) == baseURL) {
models := make([]modelAliasEntry, 0, len(key.Models))
for _, model := range key.Models {
models = append(models, model)
}
return resolveModelAliasFromConfigModels(requestedModel, models)
}
}
case "claude":
for _, key := range cfg.ClaudeKey {
if strings.TrimSpace(key.APIKey) == apiKey && (baseURL == "" || strings.TrimSpace(key.BaseURL) == baseURL) {
models := make([]modelAliasEntry, 0, len(key.Models))
for _, model := range key.Models {
models = append(models, model)
}
return resolveModelAliasFromConfigModels(requestedModel, models)
}
}
case "codex":
for _, key := range cfg.CodexKey {
if strings.TrimSpace(key.APIKey) == apiKey && (baseURL == "" || strings.TrimSpace(key.BaseURL) == baseURL) {
models := make([]modelAliasEntry, 0, len(key.Models))
for _, model := range key.Models {
models = append(models, model)
}
return resolveModelAliasFromConfigModels(requestedModel, models)
}
}
}

🛠️ Refactor suggestion | 🟠 Major

Deduplicate provider-specific API-key alias lookup.

The three switch branches are the same scan with a different config slice, and this helper is already over 50 lines. Extract a provider registry/helper so new providers don't keep extending this hot path with copy/paste.

As per coding guidelines, "Build generic building blocks (provider interface + registry) before application logic" and "Maximum function length: 40 lines".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/cliproxy/auth/manager_compat.go` around lines 543 - 574, The switch on
strings.ToLower(strings.TrimSpace(auth.Provider)) duplicates the same
API-key/model lookup for cfg.GeminiKey, cfg.ClaudeKey and cfg.CodexKey; extract
a provider registry/helper and replace the three branches with a single generic
lookup. Implement a small provider registry (map[string]func(*Config)[]KeyType
or an interface) that maps "gemini"/"claude"/"codex" to functions returning the
corresponding cfg.<GeminiKey|ClaudeKey|CodexKey> slice, then in the current
function call that helper once: iterate the returned keys, compare
strings.TrimSpace(key.APIKey) and BaseURL, build models []modelAliasEntry and
call resolveModelAliasFromConfigModels(requestedModel, models). Update names
referenced here (auth.Provider, cfg.GeminiKey, cfg.ClaudeKey, cfg.CodexKey,
resolveModelAliasFromConfigModels, modelAliasEntry) and ensure the refactor
keeps the logic identical and reduces the function length below 40 lines.
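The registry shape the prompt describes can be sketched with hypothetical trimmed-down config types (the real key structs and config live in the SDK config package):

```go
package main

import (
	"fmt"
	"strings"
)

// apiKeyEntry and config are hypothetical stand-ins for the real config types.
type apiKeyEntry struct{ APIKey string }

type config struct {
	GeminiKey, ClaudeKey, CodexKey []apiKeyEntry
}

// keySources replaces the three copy/paste switch branches with a registry:
// adding a provider becomes one map entry instead of another ~10-line branch.
var keySources = map[string]func(*config) []apiKeyEntry{
	"gemini": func(c *config) []apiKeyEntry { return c.GeminiKey },
	"claude": func(c *config) []apiKeyEntry { return c.ClaudeKey },
	"codex":  func(c *config) []apiKeyEntry { return c.CodexKey },
}

// keysFor normalizes the provider name the same way the switch did.
func keysFor(provider string, cfg *config) []apiKeyEntry {
	if src, ok := keySources[strings.ToLower(strings.TrimSpace(provider))]; ok {
		return src(cfg)
	}
	return nil
}

func main() {
	cfg := &config{CodexKey: []apiKeyEntry{{APIKey: "ck-1"}}}
	fmt.Println(len(keysFor(" Codex ", cfg))) // prints 1
}
```

The shared scan (trim the API key, compare the base URL, build the modelAliasEntry slice, call resolveModelAliasFromConfigModels) then runs once over whatever slice keysFor returns.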

Comment thread sdk/cliproxy/auth/oauth_model_alias.go Outdated
KooshaPari and others added 2 commits March 14, 2026 19:07
Co-authored-by: Codex <noreply@openai.com>
Co-authored-by: Codex <noreply@openai.com>
@KooshaPari KooshaPari merged commit 35e929d into main Mar 15, 2026
27 of 29 checks passed
@KooshaPari KooshaPari deleted the codex/stability-fix-1 branch March 15, 2026 02:20

Labels

HELIOS-CODEX Bundle identifier for HELIOS-CODEX release train HELIOS-CODEX-L0 HELIOS-CODEX foundation layer

1 participant