
codex/stability fix 1 #861

Merged
KooshaPari merged 6 commits into main from codex/stability-fix-1 on Mar 15, 2026

Conversation

Owner

@KooshaPari commented Mar 15, 2026

  • fix(openai): stabilize thinking and compat paths
  • fix(iflow): preserve non-thinking template kwargs
  • fix(sdk): restore auth manager compatibility
  • fix(stability): restore auth and executor compatibility

Summary by CodeRabbit

  • New Features

    • Added support for reasoning effort configuration with levels: none, low, medium, and high for improved thinking customization.
    • Enhanced endpoint compatibility resolution with multi-provider support.
    • Improved handling for MiniMax models.
  • Bug Fixes

    • Fixed cleanup of extraneous reasoning-related metadata in request processing.
  • Tests

    • Added comprehensive test coverage for reasoning effort extraction and endpoint override resolution.
  • Documentation

    • Updated code style and formatting for consistency.

KooshaPari and others added 4 commits March 14, 2026 00:17
Co-authored-by: Codex <noreply@openai.com>

coderabbitai Bot commented Mar 15, 2026

Caution

Review failed

Pull request was closed or merged during review

📝 Walkthrough

The changes refactor authentication token storage to embed a shared BaseTokenStorage struct across multiple providers (Gemini, Copilot, Kilo), add parsing support for iFlow reasoning effort levels with corresponding tests, enhance endpoint override resolution with provider-specific logic, and apply formatting normalizations to documentation configuration files.

Changes

  • Token Storage Refactoring - Auth Handlers (pkg/llmproxy/api/handlers/management/auth_gemini.go, auth_github.go, auth_kilo.go): Updated token storage initialization to embed BaseTokenStorage with Email/AccessToken/Type fields; added base package imports. Each handler initializes the nested struct with provider-specific values.
  • Token Storage Refactoring - Auth Packages (pkg/llmproxy/auth/gemini/gemini_auth_test.go, sdk/auth/kilo.go): Extended token storage structs to include a BaseTokenStorage field; updated test fixtures and initialization logic. Test error message expectations refined to check only "invalid file path".
  • Token Storage Tests - Path Validation (pkg/llmproxy/auth/kimi/token_path_test.go): Simplified the error string assertion in the path traversal rejection test from "invalid token file path" to "invalid file path".
  • Thinking - iFlow Reasoning Effort Parsing (pkg/llmproxy/thinking/apply.go, apply_iflow_test.go): Introduced parseIFlowReasoningEffort() to map reasoning effort strings ("none", "low", "medium", "high") to ThinkingConfig modes/levels. Added comprehensive test coverage with nested and flat JSON object structures.
  • Thinking - Field Stripping (pkg/llmproxy/thinking/strip.go, strip_test.go): Extended iFlow provider field removal to include "reasoning", "reasoning_split", "reasoning_effort", and "variant"; added cleanup logic to delete an empty chat_template_kwargs. Added tests validating field removal while preserving unrelated content.
  • iFlow Provider - Legacy Field Handling (pkg/llmproxy/thinking/provider/iflow/apply.go, apply_test.go): For MiniMax models, added removal of reasoning metadata fields (reasoning_effort, reasoning, variant) after setting reasoning_split. Comprehensive tests validate field stripping across thinking modes and model support levels.
  • Endpoint Override Resolution (sdk/api/handlers/openai/endpoint_compat.go, endpoint_compat_test.go): Enhanced resolveEndpointOverride() to iterate through model providers querying provider-specific overrides; added the resolveEndpointOverrideForInfo() helper and shouldForceNonStreamingChatBridge() for MiniMax detection. Updated function signatures across call sites.
  • OpenAI Handler Integration (sdk/api/handlers/openai/openai_handlers.go, openai_responses_handlers.go): Updated calls to resolveEndpointOverride() to pass the new provider parameter (empty string); preserves existing override logic with the new signature.
  • Documentation - Formatting (docs/.vitepress/config.ts, theme/index.ts): Normalized quote style (single to double quotes), added semicolons, adjusted indentation for consistency.
  • Documentation - Plugin Updates (docs/.vitepress/plugins/content-tabs.ts): Added an exported tabsClientScript constant containing client-side initialization; applied formatting normalization (quotes, semicolons).
  • Documentation - Vue Component Formatting (docs/.vitepress/theme/components/CategorySwitcher.vue): Minor type annotation formatting in the defineProps() signature (added trailing semicolon).

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • PR #826: Applies identical BaseTokenStorage embedding pattern to token struct initialization across multiple authentication providers (Gemini, Copilot, Kilo).
  • PR #822: Refactors provider auth token structs to embed shared BaseTokenStorage with consistent import and field initialization changes.
  • PR #824: Migrates provider token storage implementations to embed base.BaseTokenStorage, following the same structural consolidation pattern.

Poem

🐰 Tokens now nestle in their Base, so neat,
Reasoning effort makes thoughts complete,
Endpoints skip between providers with grace,
While docs get quoted and spaced just right—
A hop toward clarity, pixel by pixel bright!

🚥 Pre-merge checks: ✅ 1 passed | ❌ 2 failed

❌ Failed checks (1 warning, 1 inconclusive)

  • Docstring Coverage ⚠️ Warning: Docstring coverage is 12.20%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
  • Title check ❓ Inconclusive: The title 'codex/stability fix 1' is vague and generic and does not convey meaningful information about the changeset. Resolution: replace it with a clear, specific title that summarizes the main change, such as 'Add BaseTokenStorage embedding and iFlow thinking/endpoint support' or 'Refactor auth structures and stabilize thinking provider logic'.

✅ Passed checks (1 passed)

  • Description Check ✅ Passed: Check skipped because CodeRabbit's high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on enhancing the stability, compatibility, and maintainability of the LLM proxy. It introduces a significant refactoring of authentication token storage to use a common base, simplifies the Kiro executor by removing redundant authentication logic, and improves the handling of "thinking" configurations for iFlow models. Additionally, it refines OpenAI endpoint compatibility and restores critical functionalities within the authentication manager, ensuring more robust and consistent operation across different LLM providers and features.

Highlights

  • Authentication Refactoring: Standardized token storage across multiple LLM providers (Gemini, GitHub Copilot, Kilo, Codex, Qwen) by embedding a BaseTokenStorage struct, improving consistency and maintainability.
  • Kiro Executor Simplification: Removed dedicated Kiro authentication logic from the executor, streamlining the executor's responsibilities and reducing dependencies.
  • iFlow Thinking Configuration Enhancements: Improved handling of "thinking" configurations for iFlow models, including new parsing logic for reasoning.effort and more robust stripping of thinking-related fields from request bodies.
  • OpenAI Endpoint Compatibility: Enhanced OpenAI endpoint compatibility, particularly for iFlow models, by refining how endpoint overrides are resolved and introducing logic to force non-streaming chat bridges for specific models.
  • Auth Manager and Cooldown Compatibility Restoration: Re-implemented or restored key functionalities for the authentication manager, including registration, updates, loading, listing, auto-refresh, executor management, retry configurations, and result marking, along with quota cooldown mechanisms.
  • Configuration Import Path Updates: Standardized internal configuration import paths across various authentication packages.
Changelog
  • pkg/llmproxy/api/handlers/management/auth_gemini.go
    • Imported base.BaseTokenStorage and updated GeminiTokenStorage to embed it, centralizing email management.
  • pkg/llmproxy/api/handlers/management/auth_github.go
    • Imported base.BaseTokenStorage and updated CopilotTokenStorage to embed it, centralizing access token and type management.
  • pkg/llmproxy/api/handlers/management/auth_kilo.go
    • Imported base.BaseTokenStorage and updated KiloTokenStorage to embed it, centralizing email and type management.
  • pkg/llmproxy/auth/claude/anthropic_auth.go
    • Updated the import path for the internal configuration package.
  • pkg/llmproxy/auth/claude/utls_transport.go
    • Updated the import path for the internal configuration package.
  • pkg/llmproxy/auth/codex/openai_auth.go
    • Updated the import path for the internal configuration package.
  • pkg/llmproxy/auth/codex/openai_auth_test.go
    • Imported base.BaseTokenStorage and updated CodexTokenStorage initialization to use the base struct.
  • pkg/llmproxy/auth/codex/token_test.go
    • Imported base.BaseTokenStorage and updated CodexTokenStorage initialization to use the base struct.
  • pkg/llmproxy/auth/copilot/copilot_auth.go
    • Updated the import path for the internal configuration package.
  • pkg/llmproxy/auth/copilot/copilot_extra_test.go
    • Imported base.BaseTokenStorage and updated CopilotTokenStorage initialization to use the base struct.
  • pkg/llmproxy/auth/copilot/token_test.go
    • Imported base.BaseTokenStorage and updated CopilotTokenStorage initialization to use the base struct.
  • pkg/llmproxy/auth/gemini/gemini_auth.go
    • Updated the import path for the internal configuration package.
  • pkg/llmproxy/auth/gemini/gemini_auth_test.go
    • Imported base.BaseTokenStorage, updated GeminiTokenStorage initialization to use the base struct, and refined the error message for path traversal.
  • pkg/llmproxy/auth/iflow/iflow_auth.go
    • Updated the import path for the internal configuration package.
  • pkg/llmproxy/auth/kimi/kimi.go
    • Updated the import path for the internal configuration package.
  • pkg/llmproxy/auth/kimi/token_path_test.go
    • Imported base.BaseTokenStorage, updated KimiTokenStorage initialization to use the base struct, and refined the error message for path traversal.
  • pkg/llmproxy/auth/qwen/qwen_auth.go
    • Imported base.BaseTokenStorage and updated CreateTokenStorage to embed BaseTokenStorage.
  • pkg/llmproxy/auth/qwen/qwen_auth_test.go
    • Imported base.BaseTokenStorage and updated QwenTokenStorage initialization to use the base struct.
  • pkg/llmproxy/auth/qwen/qwen_token.go
    • Refactored QwenTokenStorage to embed base.BaseTokenStorage, added LastRefresh and Expire fields, and updated NewQwenTokenStorage and SaveTokenToFile to align with the new structure.
  • pkg/llmproxy/auth/qwen/qwen_token_test.go
    • Imported base.BaseTokenStorage and updated QwenTokenStorage initialization to use the base struct.
  • pkg/llmproxy/executor/kiro_auth.go
    • Removed the entire file, eliminating Kiro-specific authentication logic from the executor package.
  • pkg/llmproxy/executor/kiro_executor.go
    • Removed unused imports and dependencies related to Kiro authentication and usage tracking.
  • pkg/llmproxy/executor/kiro_streaming.go
    • Corrected import aliases for cliproxyauth and cliproxyexecutor to ensure proper package referencing.
  • pkg/llmproxy/executor/kiro_transform.go
    • Corrected the import alias for cliproxyauth to ensure proper package referencing.
  • pkg/llmproxy/thinking/apply.go
    • Introduced parseIFlowReasoningEffort to handle reasoning.effort fields and integrated it into extractIFlowConfig for improved iFlow thinking configuration parsing.
  • pkg/llmproxy/thinking/apply_iflow_test.go
    • Added a new test file to verify the correct extraction of iFlow thinking configurations based on reasoning.effort fields.
  • pkg/llmproxy/thinking/provider/iflow/apply.go
    • Added logic to remove reasoning_effort, reasoning, and variant fields from the request body after applying thinking configurations for iFlow.
  • pkg/llmproxy/thinking/provider/iflow/apply_test.go
    • Added a new test file to confirm that ApplyMiniMax correctly strips reasoning and variant fields from iFlow requests.
  • pkg/llmproxy/thinking/strip.go
    • Expanded the list of fields to strip for iflow provider to include reasoning and variant, and added logic to remove chat_template_kwargs if it becomes empty.
  • pkg/llmproxy/thinking/strip_test.go
    • Added a new test file to validate the stripping of thinking-related fields for the iflow provider, including empty chat_template_kwargs.
  • pkg/llmproxy/usage/metrics.go
    • Updated the import path for the util package.
  • sdk/api/handlers/openai/endpoint_compat.go
    • Enhanced resolveEndpointOverride to consider all model providers and introduced shouldForceNonStreamingChatBridge for specific models.
  • sdk/api/handlers/openai/endpoint_compat_test.go
    • Added a new test file to verify OpenAI endpoint override logic and the shouldForceNonStreamingChatBridge function.
  • sdk/api/handlers/openai/openai_handlers.go
    • Modified the call to resolveEndpointOverride to include the provider constant for more precise endpoint resolution.
  • sdk/api/handlers/openai/openai_responses_handlers.go
    • Modified the call to resolveEndpointOverride to include the provider constant for more precise endpoint resolution.
  • sdk/auth/kilo.go
    • Imported base.BaseTokenStorage and updated KiloTokenStorage initialization to use the base struct.
  • sdk/cliproxy/auth/cooldown_compat.go
    • Added a new file with functions SetQuotaCooldownDisabled and isQuotaCooldownDisabled to manage quota cooldown behavior.
  • sdk/cliproxy/auth/manager_compat.go
    • Added a new file that re-implements core Manager functionalities, including auth registration, updates, loading, listing, auto-refresh, executor management, retry configuration, and result marking, ensuring compatibility.
  • sdk/cliproxy/auth/oauth_model_alias.go
    • Added apiKeyModelAliasTable type and rebuildAPIKeyModelAliasFromRuntimeConfig method to manage API key model aliases.
Activity
  • No specific activity has been recorded for this pull request yet.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

  • Code Review: /gemini review performs a code review of the current pull request in its current state.
  • Pull Request Summary: /gemini summary provides a summary of the current pull request in its current state.
  • Comment: @gemini-code-assist responds in comments when explicitly tagged, both in pull request comments and review comments.
  • Help: /gemini help displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@coderabbitai coderabbitai Bot added HELIOS-CODEX Bundle identifier for HELIOS-CODEX release train HELIOS-CODEX-L0 HELIOS-CODEX foundation layer labels Mar 15, 2026

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces a series of stability fixes and refactorings across the authentication and execution paths. Key changes include refactoring various token storage implementations to use a common base struct, which improves maintainability. Additionally, there are fixes to stabilize the 'thinking' feature for the iFlow provider and to improve OpenAI endpoint compatibility. A significant part of this PR appears to be the restoration of the auth manager and executor compatibility, which should enhance overall system stability. My review includes one suggestion for removing redundant code in the Qwen token handling.

Comment thread: pkg/llmproxy/auth/qwen/qwen_token.go (Outdated)
Comment on lines 55 to 57
if _, err := cleanTokenFilePath(authFilePath, "qwen token"); err != nil {
return err
}

Severity: medium

This path validation check is redundant. The base.BaseTokenStorage.Save method, which is called on line 60, already performs the same path validation. Removing this block and the local cleanTokenFilePath function will help avoid code duplication and rely on the centralized validation in the base class.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 16

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
pkg/llmproxy/auth/claude/utls_transport.go (1)

67-82: ⚠️ Potential issue | 🔴 Critical

Race condition: multiple goroutines can create duplicate connections.

When Broadcast() wakes multiple waiting goroutines, they sequentially acquire the lock. The first waiter finds no connection, falls through, and creates a new pending entry. However, subsequent waiters also fall through (the original pending was deleted before Broadcast) without checking if another waiter has already started creating a connection—they create their own pending entry and connection.

This can lead to resource leaks (duplicate connections) and wasted resources.

🐛 Proposed fix: Use a loop to re-check after waking
 func (t *utlsRoundTripper) getOrCreateConnection(host, addr string) (*http2.ClientConn, error) {
 	t.mu.Lock()
 
-	// Check if connection exists and is usable
-	if h2Conn, ok := t.connections[host]; ok && h2Conn.CanTakeNewRequest() {
-		t.mu.Unlock()
-		return h2Conn, nil
-	}
-
-	// Check if another goroutine is already creating a connection
-	if cond, ok := t.pending[host]; ok {
-		// Wait for the other goroutine to finish
-		cond.Wait()
-		// Check if connection is now available
+	for {
+		// Check if connection exists and is usable
 		if h2Conn, ok := t.connections[host]; ok && h2Conn.CanTakeNewRequest() {
 			t.mu.Unlock()
 			return h2Conn, nil
 		}
-		// Connection still not available, we'll create one
+
+		// Check if another goroutine is already creating a connection
+		if cond, ok := t.pending[host]; ok {
+			// Wait for the other goroutine to finish, then re-check
+			cond.Wait()
+			continue
+		}
+
+		// No one is creating a connection, we'll do it
+		break
 	}
 
 	// Mark this host as pending
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/llmproxy/auth/claude/utls_transport.go` around lines 67 - 82, The current
wait logic can let multiple waiters proceed to create duplicate connections;
change the single Wait() to a loop that re-checks both t.connections[host] and
t.pending[host] after each wake: acquire the lock, and while there is an
existing pending entry for host, call cond.Wait(); after waking, if a usable
connection exists return it; if no pending entry exists anymore then this
goroutine should create the new sync.Cond, assign t.pending[host]=cond and
proceed as the sole creator; ensure all checks reference t.pending,
t.connections, cond.Wait(), and CanTakeNewRequest() so only one goroutine
becomes the creator.
sdk/api/handlers/openai/openai_responses_handlers.go (1)

91-98: ⚠️ Potential issue | 🟠 Major

The Minimax non-stream fallback never takes effect here.

This bridge still honors stream unconditionally. shouldForceNonStreamingChatBridge was added for this compatibility path, but this branch never applies it, so Minimax requests will keep going through the streaming chat bridge instead of the intended non-stream fallback.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/api/handlers/openai/openai_responses_handlers.go` around lines 91 - 98,
The branch that converts Responses->Chat currently respects the incoming
"stream" flag unconditionally (in the block using resolveEndpointOverride and
responsesconverter.ConvertOpenAIResponsesRequestToOpenAIChatCompletions), so the
Minimax non-stream fallback never triggers; update this block to consult
shouldForceNonStreamingChatBridge(modelName, rawJSON) after creating chatJSON
and override the stream value to false when that function returns true before
calling h.handleStreamingResponseViaChat or h.handleNonStreamingResponseViaChat
so Minimax requests are routed to the non-streaming handler; reference
resolveEndpointOverride, openAIResponsesEndpoint, OpenaiResponse,
openAIChatEndpoint,
responsesconverter.ConvertOpenAIResponsesRequestToOpenAIChatCompletions,
shouldForceNonStreamingChatBridge, h.handleStreamingResponseViaChat and
h.handleNonStreamingResponseViaChat.
sdk/api/handlers/openai/openai_handlers.go (1)

117-130: 🛠️ Refactor suggestion | 🟠 Major

Extract this compat branch out of ChatCompletions.

This routing path pushes ChatCompletions further past the 40-line limit and makes the endpoint/format dispatch harder to follow. Pull the override-and-conversion flow into a helper before adding more cases.

As per coding guidelines, "Maximum function length: 40 lines".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/api/handlers/openai/openai_handlers.go` around lines 117 - 130, Extract
the compat branch inside ChatCompletions into a new helper method (e.g.,
handleResponsesCompat or resolveAndHandleResponsesOverride) that accepts the
same inputs used there (modelName, rawJSON, stream flag, and the context/caller
receiver h and c) and returns a boolean indicating “handled” (true means caller
should return). Move the resolveEndpointOverride check, originalChat assignment,
shouldTreatAsResponsesFormat check, codexconverter.ConvertOpenAIRequestToCodex
call, re-evaluation of stream via gjson.GetBytes(rawJSON, "stream").Bool(), and
the conditional calls to h.handleStreamingResponseViaResponses /
h.handleNonStreamingResponseViaResponses into that helper. Replace the original
inlined block with a single if helper(...) { return } so ChatCompletions stays
under 40 lines and routing is clearer.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/llmproxy/auth/claude/utls_transport.go`:
- Line 13: Current code imports logrus as log (import alias: log
"github.com/sirupsen/logrus"); follow-up refactor should replace that import
with zerolog (e.g., github.com/rs/zerolog) and update all usages of the log
alias (calls like log.Info, log.Errorf, log.Debug, etc.) to zerolog equivalents
(logger.Info().Msg, logger.Error().Err(err).Msg, etc.), initializing a zerolog
logger instance where needed and ensuring structured fields are migrated to
With/Dict patterns; search for the import line and all references to the log
identifier in utls_transport.go (and related files) to perform the replacement
consistently.

In `@pkg/llmproxy/thinking/provider/iflow/apply_test.go`:
- Around line 54-93: Update the
TestApplyMiniMaxStripsReasoningVariantsAndLegacyFields test to include a
regression case where the JSON body contains a top-level "reasoning.effort" key
(e.g. "reasoning.effort":"medium") alongside the existing fields; call
a.Apply(body, cfg, model) as before and add an assertion that
res.Get("reasoning.effort").Exists() is false (or equivalent check) to ensure
Apply removes the literal reasoning.effort key; keep the rest of the assertions
unchanged so unrelated fields (e.g. "foo") are still preserved.

In `@pkg/llmproxy/thinking/provider/iflow/apply.go`:
- Around line 146-148: The cleanup in applyMiniMax misses the literal dotted key
"reasoning.effort" so add an sjson.DeleteBytes call to remove the escaped path
"reasoning\\.effort" alongside the existing deletes; locate the applyMiniMax
function in pkg/llmproxy/thinking/provider/iflow/apply.go and insert a deletion
using sjson.DeleteBytes(result, "reasoning\\.effort") (follow existing error
handling pattern) so that keys accepted by extractIFlowConfig (which reads
"reasoning\\.effort") are not leaked upstream.

In `@pkg/llmproxy/thinking/strip.go`:
- Around line 43-50: The iflow stripping list in pkg/llmproxy/thinking/strip.go
currently removes nested "reasoning" but not a literal top-level key
"reasoning.effort"; update the paths slice (the variable assigned to paths in
strip.go) to include the escaped literal key "reasoning\.effort" so top-level
dot-keys are stripped as well, and consider moving this paths list into a shared
constant used by the iflow cleanup code in provider/iflow/apply.go to avoid
drift.

In `@sdk/api/handlers/openai/endpoint_compat_test.go`:
- Around line 62-81: Register a second provider for the same modelID that only
supports "/responses" (e.g., call registry.GetGlobalRegistry().RegisterClient
with a different client name and provider id like "bridge" and ModelInfo
SupportedEndpoints []string{"/responses"}), add a t.Cleanup to UnregisterClient
that second provider, and then update the assertions around
resolveEndpointOverride(modelID, "/responses", OpenaiResponse) to assert that ok
is true and overrideEndpoint == "/responses" (i.e., the requested /responses
remains unmodified) while keeping the original iflow registration in place so
the mixed-provider case is exercised.
- Around line 45-50: The cleanup is unregistering a different client ID than the
one registered, leaking the registry entry; update the UnregisterClient call so
it uses the exact same client ID string used in RegisterClient
("endpoint-compat-iflow-direction")—i.e., locate the register call via
registry.GetGlobalRegistry().RegisterClient(...) and make the corresponding
registry.GetGlobalRegistry().UnregisterClient(...) use that identical ID.

In `@sdk/api/handlers/openai/endpoint_compat.go`:
- Around line 19-26: The loop treating
resolveEndpointOverrideForInfo(reg.GetModelInfo(...), requestedEndpoint) the
same for both "no override" and "provider already supports endpoint" causes
native-support cases to be overwritten by later providers; update the logic so
that when iterating providers from reg.GetModelProviders(modelName) you stop and
return immediately if the provider already supports the requested endpoint
(e.g., check modelInfo.SupportsEndpoint(requestedEndpoint) on
reg.GetModelInfo(modelName, provider) or change resolveEndpointOverrideForInfo
to return an explicit tri-state like (override, ok, supported)); only continue
scanning when there is no native support and an override is not present, and
only accept an override when resolveEndpointOverrideForInfo indicates an actual
override (override + ok).

In `@sdk/cliproxy/auth/cooldown_compat.go`:
- Around line 3-5: Add a Go doc comment for the exported function
SetQuotaCooldownDisabled describing its purpose and usage; place a brief
sentence or two immediately above the SetQuotaCooldownDisabled function stating
that it enables/disables quota cooldown behavior and when callers should use it,
and reference the underlying quotaCooldownDisabled store variable in the comment
for clarity.

In `@sdk/cliproxy/auth/manager_compat.go`:
- Around line 16-24: Create shared error message constants (e.g.,
ErrMsgAuthRequired, ErrMsgAuthIDRequired, ErrMsgProviderRequired) in the auth
package (or a nearby shared file) and replace the inline string literals in
manager_compat.go (locations around the auth validation branches and the other
occurrences you noted at lines ~54-60, ~439, ~476) with those constants; update
the Error{Message: ...} initializers to use the new constants (or fmt.Sprintf
with template constants if interpolation is needed) so all
register/update/select paths reference the same template values.
- Around line 39-46: The code updates the in-memory map m.auths (via
m.mu.Lock()/m.auths[next.ID] = next.Clone()) before calling m.store.Save(ctx,
next.Clone()), which can leave runtime state ahead of durable state on save
failure or races; change the flow in the methods touching m.auths and
m.store.Save (the two mirrored blocks around m.auths/next.Clone and m.store.Save
guarded by shouldSkipPersist) so you attempt the persistent Save first (call
m.store.Save with the clone), and only if Save succeeds acquire m.mu, assign
m.auths[next.ID] = next.Clone() and then unlock; alternatively hold m.mu for the
duration of Save to serialize races—ensure you use next.Clone for both Save and
the in-memory assignment and return the original error if Save fails.
- Around line 148-163: StartAutoRefresh ignores the interval and does nothing
besides waiting for ctx.Done(); change it to use the provided interval parameter
(remove the blank identifier) and spawn a goroutine that ticks at that interval
to run the Manager's refresh logic until canceled. Specifically: accept the
interval argument in StartAutoRefresh, call StopAutoRefresh() as you already do,
create refreshCtx and cancel, store cancel in m.refreshCancel, then start a
goroutine that creates a time.Ticker from the interval (handle zero/negative by
defaulting or returning) and loops with select { case <-ticker.C: invoke the
manager's refresh routine (e.g., m.Refresh() or the method that renews auths);
case <-refreshCtx.Done(): ticker.Stop(); return }. Retain the existing names
Manager, StartAutoRefresh, StopAutoRefresh, m.refreshCancel, refreshCtx, cancel
to locate the change.
- Around line 452-487: The selection mixes auths from all providers into
candidates but picks an executor only for the first provider with an executor
and always calls selector.Pick with providers[0], causing provider mismatch;
instead, for each provider iterate building provider-scoped candidates from
m.auths (using strings.EqualFold(auth.Provider, provider) and pinnedAuthID
filtering), get the corresponding exec from m.executors, and if exec != nil and
that provider has candidates call selector.Pick(ctx, provider, model, opts,
providerCandidates); if selector.Pick returns an auth, call
updateAggregatedAvailability and selectedAuthCallback(auth.ID) and immediately
return that auth and the matching exec; only if no provider yields a pick return
the auth_not_found error—this ensures executor and auth come from the same
provider (referencing candidates, executor, m.executors, providers,
selector.Pick, m.auths, pinnedAuthID, updateAggregatedAvailability,
selectedAuthCallback).
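The per-provider scoping can be sketched like this; `auths`, `executors`, and `pick` are hypothetical stand-ins for the manager state and selector:

```go
package main

import (
	"fmt"
	"strings"
)

type Auth struct{ ID, Provider string }

var (
	auths = []*Auth{
		{ID: "g1", Provider: "Gemini"},
		{ID: "c1", Provider: "claude"},
	}
	executors = map[string]string{"claude": "claude-exec"} // gemini has no executor
)

// pick returns the first candidate (selector stand-in).
func pick(candidates []*Auth) *Auth {
	if len(candidates) == 0 {
		return nil
	}
	return candidates[0]
}

// selectAuthAndExecutor scopes candidates per provider, so the returned
// auth and executor always come from the same provider.
func selectAuthAndExecutor(providers []string) (*Auth, string, error) {
	for _, provider := range providers {
		exec, ok := executors[provider]
		if !ok {
			continue // no executor registered for this provider
		}
		var candidates []*Auth
		for _, a := range auths {
			if strings.EqualFold(a.Provider, provider) {
				candidates = append(candidates, a)
			}
		}
		if auth := pick(candidates); auth != nil {
			return auth, exec, nil
		}
	}
	return nil, "", fmt.Errorf("auth_not_found")
}

func main() {
	auth, exec, err := selectAuthAndExecutor([]string{"gemini", "claude"})
	fmt.Println(auth.ID, exec, err)
}
```

Here gemini is skipped for lack of an executor and the claude auth is paired with the claude executor, which is exactly the mismatch the original code could violate.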
- Around line 428-434: The ExecuteStream path currently returns
executor.ExecuteStream errors without updating quota/backoff state; modify
Manager.ExecuteStream so after selecting auth/executor (selectAuthAndExecutor)
and applying applyAPIKeyModelAlias, you call executor.ExecuteStream and if it
returns an error immediately call m.recordExecutionResult(...) with the same
ctx, auth, req.Model (or the aliased model) and an appropriate failure result to
update quota/backoff state before returning the error; keep the existing success
return path unchanged so only startup failures are recorded.
- Around line 543-574: The switch on
strings.ToLower(strings.TrimSpace(auth.Provider)) duplicates the same
API-key/model lookup for cfg.GeminiKey, cfg.ClaudeKey and cfg.CodexKey; extract
a provider registry/helper and replace the three branches with a single generic
lookup. Implement a small provider registry (map[string]func(*Config)[]KeyType
or an interface) that maps "gemini"/"claude"/"codex" to functions returning the
corresponding cfg.<GeminiKey|ClaudeKey|CodexKey> slice, then in the current
function call that helper once: iterate the returned keys, compare
strings.TrimSpace(key.APIKey) and BaseURL, build models []modelAliasEntry and
call resolveModelAliasFromConfigModels(requestedModel, models). Update names
referenced here (auth.Provider, cfg.GeminiKey, cfg.ClaudeKey, cfg.CodexKey,
resolveModelAliasFromConfigModels, modelAliasEntry) and ensure the refactor
keeps the logic identical and reduces the function length below 40 lines.
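The registry shape can be sketched as a table of accessors; `Config` and `KeyType` here are hypothetical stand-ins for the real config structs:

```go
package main

import (
	"fmt"
	"strings"
)

type KeyType struct{ APIKey, BaseURL string }

type Config struct {
	GeminiKey []KeyType
	ClaudeKey []KeyType
	CodexKey  []KeyType
}

// providerKeys replaces the three duplicated switch branches with one
// table lookup mapping provider names to key-slice accessors.
var providerKeys = map[string]func(*Config) []KeyType{
	"gemini": func(c *Config) []KeyType { return c.GeminiKey },
	"claude": func(c *Config) []KeyType { return c.ClaudeKey },
	"codex":  func(c *Config) []KeyType { return c.CodexKey },
}

func keysFor(cfg *Config, provider string) []KeyType {
	get, ok := providerKeys[strings.ToLower(strings.TrimSpace(provider))]
	if !ok {
		return nil
	}
	return get(cfg)
}

func main() {
	cfg := &Config{ClaudeKey: []KeyType{{APIKey: "sk-1", BaseURL: "https://api.example"}}}
	fmt.Println(len(keysFor(cfg, " Claude ")), len(keysFor(cfg, "unknown")))
}
```

The caller then iterates the returned keys and performs the alias resolution once, instead of per provider branch.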
- Around line 186-193: The manager currently calls CloseExecutionSession while
holding m.mu which can deadlock; modify Set/replace logic so you capture the
replaced executor (m.executors[key]) while holding m.mu, then release the lock
and call CloseExecutionSession(CloseAllExecutionSessionsID) on it outside the
lock; specifically, in the function that assigns m.executors[key] (the block
that checks for replaced, ok := m.executors[key] and type-asserts to
ExecutionSessionCloser), store the replaced value in a local variable while
locked, perform m.executors[key] = executor, unlock (defer removal of defer if
needed), and only after the mutex is released, if the local variable implements
ExecutionSessionCloser call
closer.CloseExecutionSession(CloseAllExecutionSessionsID).
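The capture-then-close-outside-the-lock pattern looks roughly like this (executor types are stand-ins):

```go
package main

import (
	"fmt"
	"sync"
)

type ExecutionSessionCloser interface{ CloseExecutionSession(id string) }

type exec struct{ closed []string }

func (e *exec) CloseExecutionSession(id string) { e.closed = append(e.closed, id) }

const CloseAllExecutionSessionsID = "*"

type Manager struct {
	mu        sync.Mutex
	executors map[string]any
}

// Set captures the replaced executor while locked, but calls
// CloseExecutionSession only after releasing the mutex, avoiding the
// deadlock when the closer re-enters the manager.
func (m *Manager) Set(key string, e any) {
	m.mu.Lock()
	replaced := m.executors[key]
	m.executors[key] = e
	m.mu.Unlock()
	if closer, ok := replaced.(ExecutionSessionCloser); ok {
		closer.CloseExecutionSession(CloseAllExecutionSessionsID)
	}
}

func main() {
	old := &exec{}
	m := &Manager{executors: map[string]any{"k": old}}
	m.Set("k", &exec{})
	fmt.Println("old closed:", len(old.closed) == 1 && old.closed[0] == "*")
}
```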

In `@sdk/cliproxy/auth/oauth_model_alias.go`:
- Around line 60-65: The method rebuildAPIKeyModelAliasFromRuntimeConfig
currently only clears m.apiKeyModelAlias but its name implies it reads runtime
config and repopulates entries; either rename it to clearAPIKeyModelAliasCache
(leave body as m.apiKeyModelAlias.Store(apiKeyModelAliasTable(nil))) or
implement the intended rebuild: fetch the runtime config from Manager (e.g.
m.runtimeConfig or a getter like m.GetRuntimeConfig()), build the alias table
via apiKeyModelAliasTable(cfg) and then call m.apiKeyModelAlias.Store(...) to
replace the cache; update callers to match the new name or behavior and keep
references to Manager, rebuildAPIKeyModelAliasFromRuntimeConfig,
apiKeyModelAlias, and apiKeyModelAliasTable to locate the change.
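If the rebuild behavior is the intent, the shape is roughly the following; `runtimeConfig` is a hypothetical stand-in for however Manager exposes its runtime config:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

type apiKeyModelAliasTable map[string]string

type runtimeConfig struct{ aliases map[string]string }

type Manager struct {
	apiKeyModelAlias atomic.Value // holds apiKeyModelAliasTable
	runtimeConfig    *runtimeConfig
}

// rebuildAPIKeyModelAliasFromRuntimeConfig lives up to its name: it reads
// the runtime config and stores a freshly built table, instead of only
// clearing the cache.
func (m *Manager) rebuildAPIKeyModelAliasFromRuntimeConfig() {
	table := apiKeyModelAliasTable{}
	if m.runtimeConfig != nil {
		for k, v := range m.runtimeConfig.aliases {
			table[k] = v
		}
	}
	m.apiKeyModelAlias.Store(table)
}

func main() {
	m := &Manager{runtimeConfig: &runtimeConfig{aliases: map[string]string{"fast": "gpt-4o-mini"}}}
	m.rebuildAPIKeyModelAliasFromRuntimeConfig()
	table := m.apiKeyModelAlias.Load().(apiKeyModelAliasTable)
	fmt.Println(table["fast"])
}
```

The rename-only option (`clearAPIKeyModelAliasCache` storing an empty table) is the smaller change if callers always repopulate lazily.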

---

Outside diff comments:
In `@pkg/llmproxy/auth/claude/utls_transport.go`:
- Around line 67-82: The current wait logic can let multiple waiters proceed to
create duplicate connections; change the single Wait() to a loop that re-checks
both t.connections[host] and t.pending[host] after each wake: acquire the lock,
and while there is an existing pending entry for host, call cond.Wait(); after
waking, if a usable connection exists return it; if no pending entry exists
anymore then this goroutine should create the new sync.Cond, assign
t.pending[host]=cond and proceed as the sole creator; ensure all checks
reference t.pending, t.connections, cond.Wait(), and CanTakeNewRequest() so only
one goroutine becomes the creator.

In `@sdk/api/handlers/openai/openai_handlers.go`:
- Around line 117-130: Extract the compat branch inside ChatCompletions into a
new helper method (e.g., handleResponsesCompat or
resolveAndHandleResponsesOverride) that accepts the same inputs used there
(modelName, rawJSON, stream flag, and the context/caller receiver h and c) and
returns a boolean indicating “handled” (true means caller should return). Move
the resolveEndpointOverride check, originalChat assignment,
shouldTreatAsResponsesFormat check, codexconverter.ConvertOpenAIRequestToCodex
call, re-evaluation of stream via gjson.GetBytes(rawJSON, "stream").Bool(), and
the conditional calls to h.handleStreamingResponseViaResponses /
h.handleNonStreamingResponseViaResponses into that helper. Replace the original
inlined block with a single if helper(...) { return } so ChatCompletions stays
under 40 lines and routing is clearer.

In `@sdk/api/handlers/openai/openai_responses_handlers.go`:
- Around line 91-98: The branch that converts Responses->Chat currently respects
the incoming "stream" flag unconditionally (in the block using
resolveEndpointOverride and
responsesconverter.ConvertOpenAIResponsesRequestToOpenAIChatCompletions), so the
Minimax non-stream fallback never triggers; update this block to consult
shouldForceNonStreamingChatBridge(modelName, rawJSON) after creating chatJSON
and override the stream value to false when that function returns true before
calling h.handleStreamingResponseViaChat or h.handleNonStreamingResponseViaChat
so Minimax requests are routed to the non-streaming handler; reference
resolveEndpointOverride, openAIResponsesEndpoint, OpenaiResponse,
openAIChatEndpoint,
responsesconverter.ConvertOpenAIResponsesRequestToOpenAIChatCompletions,
shouldForceNonStreamingChatBridge, h.handleStreamingResponseViaChat and
h.handleNonStreamingResponseViaChat.
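The override check can be sketched as follows; `shouldForceNonStreamingChatBridge` here is reduced to a model-prefix test and the handlers to returned names, purely as stand-ins:

```go
package main

import (
	"fmt"
	"strings"
)

// Stand-in for the real check: any MiniMax model forces the
// non-streaming chat bridge.
func shouldForceNonStreamingChatBridge(modelName string) bool {
	return strings.HasPrefix(strings.ToLower(modelName), "minimax")
}

// route consults the override after converting Responses->Chat, instead
// of trusting the incoming "stream" flag unconditionally.
func route(modelName string, stream bool) string {
	if shouldForceNonStreamingChatBridge(modelName) {
		stream = false // MiniMax: fall back to the non-streaming handler
	}
	if stream {
		return "handleStreamingResponseViaChat"
	}
	return "handleNonStreamingResponseViaChat"
}

func main() {
	fmt.Println(route("MiniMax-M2", true))
	fmt.Println(route("gpt-4o", true))
}
```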

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: ceff7acf-7eb6-4d80-a16e-d01eb1d31360

📥 Commits

Reviewing files that changed from the base of the PR and between a82cb72 and c7079b7.

📒 Files selected for processing (39)
  • pkg/llmproxy/api/handlers/management/auth_gemini.go
  • pkg/llmproxy/api/handlers/management/auth_github.go
  • pkg/llmproxy/api/handlers/management/auth_kilo.go
  • pkg/llmproxy/auth/claude/anthropic_auth.go
  • pkg/llmproxy/auth/claude/utls_transport.go
  • pkg/llmproxy/auth/codex/openai_auth.go
  • pkg/llmproxy/auth/codex/openai_auth_test.go
  • pkg/llmproxy/auth/codex/token_test.go
  • pkg/llmproxy/auth/copilot/copilot_auth.go
  • pkg/llmproxy/auth/copilot/copilot_extra_test.go
  • pkg/llmproxy/auth/copilot/token_test.go
  • pkg/llmproxy/auth/gemini/gemini_auth.go
  • pkg/llmproxy/auth/gemini/gemini_auth_test.go
  • pkg/llmproxy/auth/iflow/iflow_auth.go
  • pkg/llmproxy/auth/kimi/kimi.go
  • pkg/llmproxy/auth/kimi/token_path_test.go
  • pkg/llmproxy/auth/qwen/qwen_auth.go
  • pkg/llmproxy/auth/qwen/qwen_auth_test.go
  • pkg/llmproxy/auth/qwen/qwen_token.go
  • pkg/llmproxy/auth/qwen/qwen_token_test.go
  • pkg/llmproxy/executor/kiro_auth.go
  • pkg/llmproxy/executor/kiro_executor.go
  • pkg/llmproxy/executor/kiro_streaming.go
  • pkg/llmproxy/executor/kiro_transform.go
  • pkg/llmproxy/thinking/apply.go
  • pkg/llmproxy/thinking/apply_iflow_test.go
  • pkg/llmproxy/thinking/provider/iflow/apply.go
  • pkg/llmproxy/thinking/provider/iflow/apply_test.go
  • pkg/llmproxy/thinking/strip.go
  • pkg/llmproxy/thinking/strip_test.go
  • pkg/llmproxy/usage/metrics.go
  • sdk/api/handlers/openai/endpoint_compat.go
  • sdk/api/handlers/openai/endpoint_compat_test.go
  • sdk/api/handlers/openai/openai_handlers.go
  • sdk/api/handlers/openai/openai_responses_handlers.go
  • sdk/auth/kilo.go
  • sdk/cliproxy/auth/cooldown_compat.go
  • sdk/cliproxy/auth/manager_compat.go
  • sdk/cliproxy/auth/oauth_model_alias.go
💤 Files with no reviewable changes (2)
  • pkg/llmproxy/executor/kiro_executor.go
  • pkg/llmproxy/executor/kiro_auth.go
📜 Review details
🧰 Additional context used
📓 Path-based instructions (1)
**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

**/*.go: NEVER create a v2 file - refactor the original instead
NEVER create a new class if an existing one can be made generic
NEVER create custom implementations when an OSS library exists - search pkg.go.dev for existing libraries before writing code
Build generic building blocks (provider interface + registry) before application logic
Use chi for HTTP routing (NOT custom routers)
Use zerolog for logging (NOT fmt.Print)
Use viper for configuration (NOT manual env parsing)
Use go-playground/validator for validation (NOT manual if/else validation)
Use golang.org/x/time/rate for rate limiting (NOT custom limiters)
Use template strings for messages instead of hardcoded messages and config-driven logic instead of code-driven
Zero new lint suppressions without inline justification
All new code must pass: go fmt, go vet, golint
Maximum function length: 40 lines
No placeholder TODOs in committed code

Files:

  • pkg/llmproxy/auth/gemini/gemini_auth_test.go
  • pkg/llmproxy/auth/codex/openai_auth.go
  • pkg/llmproxy/auth/iflow/iflow_auth.go
  • pkg/llmproxy/auth/claude/anthropic_auth.go
  • pkg/llmproxy/auth/claude/utls_transport.go
  • pkg/llmproxy/auth/copilot/copilot_auth.go
  • pkg/llmproxy/thinking/provider/iflow/apply.go
  • sdk/cliproxy/auth/cooldown_compat.go
  • pkg/llmproxy/auth/gemini/gemini_auth.go
  • pkg/llmproxy/auth/codex/openai_auth_test.go
  • sdk/api/handlers/openai/openai_responses_handlers.go
  • pkg/llmproxy/thinking/strip_test.go
  • pkg/llmproxy/thinking/provider/iflow/apply_test.go
  • sdk/cliproxy/auth/oauth_model_alias.go
  • sdk/api/handlers/openai/endpoint_compat_test.go
  • pkg/llmproxy/thinking/apply_iflow_test.go
  • pkg/llmproxy/auth/copilot/copilot_extra_test.go
  • pkg/llmproxy/auth/kimi/kimi.go
  • pkg/llmproxy/executor/kiro_streaming.go
  • sdk/auth/kilo.go
  • sdk/api/handlers/openai/openai_handlers.go
  • pkg/llmproxy/api/handlers/management/auth_kilo.go
  • pkg/llmproxy/usage/metrics.go
  • pkg/llmproxy/api/handlers/management/auth_gemini.go
  • pkg/llmproxy/executor/kiro_transform.go
  • sdk/api/handlers/openai/endpoint_compat.go
  • pkg/llmproxy/auth/qwen/qwen_auth_test.go
  • pkg/llmproxy/auth/codex/token_test.go
  • pkg/llmproxy/api/handlers/management/auth_github.go
  • pkg/llmproxy/thinking/strip.go
  • pkg/llmproxy/auth/qwen/qwen_token_test.go
  • pkg/llmproxy/auth/copilot/token_test.go
  • pkg/llmproxy/auth/qwen/qwen_token.go
  • pkg/llmproxy/auth/qwen/qwen_auth.go
  • pkg/llmproxy/auth/kimi/token_path_test.go
  • pkg/llmproxy/thinking/apply.go
  • sdk/cliproxy/auth/manager_compat.go
🧠 Learnings (1)
📚 Learning: 2026-02-25T10:11:41.448Z
Learnt from: CR
Repo: KooshaPari/cliproxyapi-plusplus PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-02-25T10:11:41.448Z
Learning: Applies to **/*.go : Use template strings for messages instead of hardcoded messages and config-driven logic instead of code-driven

Applied to files:

  • pkg/llmproxy/thinking/strip.go
🔇 Additional comments (38)
pkg/llmproxy/usage/metrics.go (1)

7-7: Import path migration looks correct.

Line 7 now matches the module declared in go.mod and the util package path, which avoids mixed-prefix import/build issues.

pkg/llmproxy/executor/kiro_transform.go (1)

13-13: Import alias rename is clean and non-breaking.

Line 13 updates the alias to cliproxyauth and usage remains consistent in this file.

pkg/llmproxy/executor/kiro_streaming.go (1)

22-23: Alias updates are consistent and safe.

Lines 22-23 align import aliases with current naming (cliproxyauth, cliproxyexecutor) and all in-file references match.

pkg/llmproxy/auth/claude/anthropic_auth.go (1)

16-16: Import path migration looks correct.

This update is consistent with relocating config usage to pkg/llmproxy/config and introduces no functional risk in this segment.

sdk/cliproxy/auth/cooldown_compat.go (1)

7-9: LGTM!

Correct use of atomic.Bool.Load() for thread-safe read access. The unexported getter properly encapsulates access to the package-level variable and aligns with its usage in manager_compat.go.

pkg/llmproxy/auth/iflow/iflow_auth.go (1)

16-16: Import migration verified as complete and consistent across all auth modules.

The import path update from internal/config to pkg/llmproxy/config is correct and has been successfully applied across all auth providers (copilot, codex, synthesizer, qwen, kiro, kimi, claude, diff, gemini, antigravity). No remaining references to internal/config exist in auth modules.

pkg/llmproxy/auth/claude/utls_transport.go (1)

11-11: LGTM!

The import path update from internal/config to pkg/llmproxy/config is consistent with the broader refactoring across auth modules described in the PR objectives.

pkg/llmproxy/auth/kimi/kimi.go (1)

18-18: LGTM!

The import path change from internal/config to pkg/llmproxy/config is consistent with the broader refactoring to expose configuration as a public package.

pkg/llmproxy/auth/codex/openai_auth.go (1)

17-17: LGTM!

The import path update aligns with the consistent migration from internal/config to pkg/llmproxy/config across the codebase.

pkg/llmproxy/auth/codex/openai_auth_test.go (2)

12-12: LGTM!

Import correctly added to support the embedded BaseTokenStorage structure in test initializations.


300-300: LGTM!

Test initialization correctly updated to use the embedded BaseTokenStorage structure, maintaining compatibility with field promotion for assertions on storage.AccessToken.

pkg/llmproxy/api/handlers/management/auth_github.go (2)

10-10: LGTM!

Import for base package correctly added to support the embedded BaseTokenStorage pattern.


55-58: LGTM!

Token storage initialization correctly migrated to use the embedded BaseTokenStorage structure with AccessToken and Type fields properly nested.

pkg/llmproxy/auth/copilot/copilot_extra_test.go (3)

13-13: LGTM!

Import correctly added to support BaseTokenStorage usage in test initializations.


179-181: LGTM!

Test initialization correctly updated to construct CopilotTokenStorage with embedded BaseTokenStorage.


187-189: LGTM!

Consistent with the valid token test case, correctly using embedded BaseTokenStorage for the expired token scenario.

pkg/llmproxy/auth/copilot/copilot_auth.go (1)

13-13: LGTM!

Import path correctly updated to pkg/llmproxy/config, consistent with the codebase-wide migration. The SDKConfig field access via cfg.SDKConfig remains valid per the config struct definition.

pkg/llmproxy/auth/kimi/token_path_test.go (3)

6-7: LGTM!

Import correctly added to support BaseTokenStorage usage.


11-11: LGTM!

Test initialization correctly updated to use embedded BaseTokenStorage structure.


18-18: No action needed. The test correctly validates the error message. The underlying SaveTokenToFile delegates to BaseTokenStorage.Save(), which generates the error message "base token storage: invalid file path: %w". Since the test uses strings.Contains() to check for "invalid file path", it will successfully match this error message. The implementation and test are aligned—there is no mismatch or regression.

> Likely an incorrect or invalid review comment.
pkg/llmproxy/auth/codex/token_test.go (2)

8-9: LGTM!

Import correctly added to support BaseTokenStorage usage in test initializations.


21-29: LGTM!

Test initialization correctly updated to nest AccessToken, RefreshToken, and Email inside BaseTokenStorage, while keeping IDToken and AccountID at the top level. The test assertions on lines 46 and 49 will work correctly due to Go's embedded struct field promotion.

pkg/llmproxy/auth/copilot/token_test.go (1)

9-9: Base token embedding update in tests looks correct.

The fixture now matches the migrated token storage shape and should keep serialization coverage aligned with production structs.

Also applies to: 22-25

pkg/llmproxy/auth/gemini/gemini_auth.go (1)

17-17: Config import path update is consistent and safe.

This change cleanly aligns Gemini auth with the new pkg/llmproxy/config package location.

pkg/llmproxy/api/handlers/management/auth_kilo.go (1)

10-10: Kilo token storage migration is applied correctly.

Using BaseTokenStorage here keeps the handler aligned with the shared auth storage model while preserving current behavior.

Also applies to: 60-63

pkg/llmproxy/auth/qwen/qwen_auth_test.go (1)

13-13: Qwen auth test fixture update is consistent with BaseTokenStorage embedding.

The changed setup keeps the traversal-path test aligned with the new storage structure.

Also applies to: 157-157

sdk/auth/kilo.go (1)

8-8: SDK Kilo storage initialization is compatible with the embedded base struct.

This update correctly preserves Email/Type through BaseTokenStorage during auth record construction.

Also applies to: 101-104

pkg/llmproxy/auth/qwen/qwen_token_test.go (1)

8-8: Qwen token tests are correctly updated for BaseTokenStorage embedding.

Both test cases now initialize token data using the shared base storage shape.

Also applies to: 17-20, 35-37

pkg/llmproxy/auth/gemini/gemini_auth_test.go (1)

15-15: Gemini auth tests are aligned with the storage refactor and updated error wording.

The changed fixture shape and traversal-path assertion remain coherent with the new token storage contract.

Also applies to: 50-53, 82-82

pkg/llmproxy/api/handlers/management/auth_gemini.go (1)

14-14: Gemini management handler uses the new base token storage pattern correctly.

This keeps handler-generated auth records compatible with the shared storage model.

Also applies to: 142-144

pkg/llmproxy/auth/qwen/qwen_auth.go (2)

16-16: LGTM!

Import correctly added for the new base package containing BaseTokenStorage.


351-363: LGTM!

The refactored CreateTokenStorage correctly initializes the embedded BaseTokenStorage with token data. This approach:

  • Properly sets AccessToken and RefreshToken within the embedded struct
  • Maintains the Qwen-specific fields (LastRefresh, ResourceURL, Expire) at the outer struct level
  • Aligns with the pattern used across other provider modules in this PR
pkg/llmproxy/auth/qwen/qwen_token.go (3)

19-30: LGTM!

The struct refactoring is well-designed:

  • Anonymous embedding of base.BaseTokenStorage promotes fields to the JSON top level, maintaining backward compatibility with existing token files
  • Value embedding (not pointer) ensures the base storage is always initialized, avoiding nil pointer issues
  • Qwen-specific fields remain cleanly separated at the outer struct level

38-42: LGTM!

Constructor correctly initializes only the FilePath field, which is appropriate since other fields will be populated during authentication or when loading from file.


53-61: LGTM!

The SaveTokenToFile method correctly:

  1. Sets the Type field on the embedded BaseTokenStorage before saving
  2. Passes the full QwenTokenStorage (ts) as the second parameter to Save(), ensuring all provider-specific fields are serialized alongside base fields

This aligns with the Save method signature shown in pkg/llmproxy/auth/base/token_storage.go:52-95.

pkg/llmproxy/thinking/apply.go (1)

535-565: Good iFlow compatibility handling for effort parsing.

Supporting both nested reasoning.effort and escaped literal reasoning\.effort is a solid stabilization change.

pkg/llmproxy/thinking/strip_test.go (1)

9-71: Test coverage for iFlow strip behavior looks solid.

These cases clearly validate field stripping and preservation of unrelated payload content.

pkg/llmproxy/thinking/apply_iflow_test.go (1)

5-64: Good targeted tests for extractIFlowConfig.

The table-driven cases exercise both payload shapes and expected mode/level mapping cleanly.

-	pkgconfig "github.com/kooshapari/cliproxyapi-plusplus/v6/internal/config"
+	pkgconfig "github.com/kooshapari/cliproxyapi-plusplus/v6/pkg/llmproxy/config"
 	tls "github.com/refraction-networking/utls"
 	log "github.com/sirupsen/logrus"

🧹 Nitpick | 🔵 Trivial

Consider migrating to zerolog.

The code uses logrus for logging, but the coding guidelines specify using zerolog. While this is pre-existing code, consider aligning with the guideline in a future refactor.

As per coding guidelines: "Use zerolog for logging (NOT fmt.Print)".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/llmproxy/auth/claude/utls_transport.go` at line 13, the current code
imports logrus as log (import alias: log "github.com/sirupsen/logrus"); a
follow-up
refactor should replace that import with zerolog (e.g., github.com/rs/zerolog)
and update all usages of the log alias (calls like log.Info, log.Errorf,
log.Debug, etc.) to zerolog equivalents (logger.Info().Msg,
logger.Error().Err(err).Msg, etc.), initializing a zerolog logger instance where
needed and ensuring structured fields are migrated to With/Dict patterns; search
for the import line and all references to the log identifier in
utls_transport.go (and related files) to perform the replacement consistently.

Comment on lines +54 to +93
func TestApplyMiniMaxStripsReasoningVariantsAndLegacyFields(t *testing.T) {
a := NewApplier()
body := []byte(`{
"reasoning_split":true,
"reasoning_effort":"high",
"reasoning":{"effort":"medium","summary":{"text":"legacy"}},
"variant":"low",
"foo":"bar"
}`)
cfg := thinking.ThinkingConfig{Mode: thinking.ModeLevel, Level: thinking.LevelLow}
model := &registry.ModelInfo{
ID: "minimax-m2.5",
Thinking: &registry.ThinkingSupport{Levels: []string{"low", "medium", "high"}},
}

out, err := a.Apply(body, cfg, model)
if err != nil {
t.Fatalf("unexpected error: %v", err)
}

res := gjson.ParseBytes(out)
if !res.Get("reasoning_split").Bool() {
t.Fatalf("expected reasoning_split=true")
}
if res.Get("reasoning_effort").Exists() {
t.Fatalf("expected reasoning_effort to be removed")
}
if res.Get("reasoning").Exists() {
t.Fatalf("expected reasoning object to be removed")
}
if res.Get("reasoning.effort").Exists() {
t.Fatalf("expected reasoning.effort to be removed")
}
if res.Get("variant").Exists() {
t.Fatalf("expected variant to be removed")
}
if res.Get("foo").String() != "bar" {
t.Fatalf("expected unrelated fields to be preserved")
}
}

🧹 Nitpick | 🔵 Trivial

Add a regression case for literal reasoning.effort input.

Please add one MiniMax test body containing a top-level "reasoning.effort" key and assert it is removed after Apply, to guard the compatibility cleanup path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/llmproxy/thinking/provider/iflow/apply_test.go` around lines 54-93,
update the TestApplyMiniMaxStripsReasoningVariantsAndLegacyFields test to
include a regression case where the JSON body contains a top-level
"reasoning.effort" key (e.g. "reasoning.effort":"medium") alongside the existing
fields; call a.Apply(body, cfg, model) as before and add an assertion that
res.Get("reasoning.effort").Exists() is false (or equivalent check) to ensure
Apply removes the literal reasoning.effort key; keep the rest of the assertions
unchanged so unrelated fields (e.g. "foo") are still preserved.

Comment on lines +146 to +148
result, _ = sjson.DeleteBytes(result, "reasoning_effort")
result, _ = sjson.DeleteBytes(result, "reasoning")
result, _ = sjson.DeleteBytes(result, "variant")

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, get the file structure and locate the apply.go file
find . -type f -name "apply.go" | head -20

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 503


🏁 Script executed:

# Read the apply.go file with context around lines 146-148
cat -n pkg/llmproxy/thinking/provider/iflow/apply.go | head -170 | tail -50

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 1842


🏁 Script executed:

# Look for extractIFlowConfig and understand what fields it expects
rg -B 5 -A 15 "extractIFlowConfig" pkg/llmproxy/thinking/

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 5435


🏁 Script executed:

# Search for "reasoning.effort" or "reasoning_effort" patterns in the codebase
rg "reasoning[._]effort" --type go

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 31454


🏁 Script executed:

# Check the test for applyMiniMax to see what's expected
cat -n pkg/llmproxy/thinking/provider/iflow/apply_test.go | grep -A 30 "applyMiniMax"

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 57


🏁 Script executed:

# Look for how sjson is used with dotted keys in the codebase
rg -B 2 -A 2 "sjson.*\\\\\." --type go

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 57


🏁 Script executed:

# Check the full apply_iflow_test.go test cases
cat -n pkg/llmproxy/thinking/apply_iflow_test.go | grep -B 10 -A 10 "reasoning.effort"

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 1105


🏁 Script executed:

# Search for actual sjson usage patterns with special characters
rg "sjson\.(Set|Delete)" pkg/llmproxy/thinking/provider/iflow/ --type go -B 1 -A 1

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 1368


🏁 Script executed:

# Look for applyMiniMax tests
cat -n pkg/llmproxy/thinking/provider/iflow/apply_test.go

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 3416


🏁 Script executed:

# Check how sjson handles dotted keys - search for documentation or usage examples
rg "sjson" pkg/llmproxy/thinking/provider/iflow/apply_test.go -B 2 -A 2

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 57


🏁 Script executed:

# Let's check if there are any tests that verify the cleanup of reasoning.effort (literal key)
rg "reasoning\.effort" pkg/llmproxy/thinking/provider/iflow/ -B 3 -A 3

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 754


🏁 Script executed:

# Check if there are any tests that specifically test literal "reasoning.effort" key removal in applyMiniMax
cat -n pkg/llmproxy/thinking/provider/iflow/apply_test.go | grep -B 20 "reasoning.effort"

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 3114


🏁 Script executed:

# Look at the full test to see if literal keys are tested
cat -n pkg/llmproxy/thinking/provider/iflow/apply_test.go | sed -n '54,93p'

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 1546


🏁 Script executed:

# Search for any sjson escape examples or documentation in the codebase
rg "sjson.*Delete.*\\\\\\\\" --type go

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 57


🏁 Script executed:

# Check the tidwall/sjson documentation by searching for test patterns
rg -B 5 -A 5 "literal.*key\|key.*literal" pkg/llmproxy/thinking/provider/iflow/

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 57


🏁 Script executed:

# Run the actual test to see if it passes
cd . && go test ./pkg/llmproxy/thinking/provider/iflow -v -run TestApplyMiniMaxStripsReasoningVariantsAndLegacyFields 2>&1 | head -100

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 597


🏁 Script executed:

# Check if the test input can actually contain a literal "reasoning.effort" key
python3 << 'EOF'
import json
import subprocess

# Test 1: nested structure (what's in the test)
nested = {
    "reasoning_split": True,
    "reasoning_effort": "high",
    "reasoning": {"effort": "medium", "summary": {"text": "legacy"}},
    "variant": "low",
    "foo": "bar"
}

# Test 2: literal dotted key
literal = {
    "reasoning.effort": "high",  # literal key with a dot
    "foo": "bar"
}

print("Test 1 - Nested structure:")
print(json.dumps(nested))
print("\nTest 2 - Literal dotted key:")
print(json.dumps(literal))
EOF

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 326


🏁 Script executed:

# Now let's test if applyMiniMax would clean up a literal "reasoning.effort" key
# by creating a simple test
python3 << 'EOF'
import json

# When JSON is parsed, "reasoning.effort" as a key is literally that string
# gjson.Get("reasoning.effort") looks for nested path reasoning -> effort
# But a literal key "reasoning.effort" is at the top level

body_with_literal_key = {
    "reasoning.effort": "high",
    "foo": "bar"
}

print("JSON with literal dotted key:")
print(json.dumps(body_with_literal_key))
print("\nAccessing via gjson path 'reasoning.effort' would NOT find this")
print("because it looks for nested path, not the literal key")
print("\nAccessing via literal path 'reasoning\\.effort' (escaped) would find it")
EOF

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 338


🏁 Script executed:

# Look for how sjson handles literal keys with dots
rg -B 3 -A 3 "sjson.*Delete.*reasoning" pkg/llmproxy/thinking/ --type go

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 693


🏁 Script executed:

# Check if there's documentation or examples of cleaning literal dotted keys in tests
cat -n pkg/llmproxy/thinking/strip_test.go | head -50

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 1770


🏁 Script executed:

# Let's verify what the extractIFlowConfig actually extracts from a literal key
cat -n pkg/llmproxy/thinking/apply.go | grep -A 30 "func extractIFlowConfig"

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 1409


🏁 Script executed:

# Now create a test to verify the behavior with a literal "reasoning.effort" key
python3 << 'EOF'
import json
import subprocess

# Create test input with literal "reasoning.effort" key
test_json = {
    "reasoning.effort": "high",
    "reasoning_split": False,
    "foo": "bar"
}

print("Test JSON with literal 'reasoning.effort' key:")
print(json.dumps(test_json))

# This would be sent as input to applyMiniMax
# The question is: does the cleanup remove it?
EOF

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 190


🏁 Script executed:

# Check the actual behavior by looking at strip.go which seems to handle similar cleanup
cat -n pkg/llmproxy/thinking/strip.go | head -80

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 2353


🏁 Script executed:

# Perfect! I can see the key finding: in extractIFlowConfig (line 541), 
# it specifically looks for escaped literal key "reasoning\\.effort"
# Let's verify if applyMiniMax actually handles this

# Create a test to see if the literal key would be cleaned up
python3 << 'EOF'
# The critical finding:
# extractIFlowConfig DOES handle both:
# 1. Nested: gjson.GetBytes(body, "reasoning.effort") 
# 2. Literal: gjson.GetBytes(body, "reasoning\\.effort")

# But applyMiniMax only cleans:
# - "reasoning_effort" (underscore)
# - "reasoning" (nested object)
# - "variant"

# It does NOT clean the literal "reasoning.effort" key!
# This means if extractIFlowConfig accepts it, but applyMiniMax doesn't remove it,
# it will leak upstream.

print("Confirmed: The literal 'reasoning.effort' key is supported by extractIFlowConfig")
print("but NOT cleaned up by applyMiniMax - this is indeed a bug!")
EOF

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 214


🏁 Script executed:

# Now let's verify the correct escaping for sjson
# Check how sjson is used elsewhere with dots
rg "sjson.*\\\\\\\\" --type go | head -20

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 57


🏁 Script executed:

# Let's see if there are any other examples of sjson with escaped keys
rg "sjson.Delete.*\\\\" pkg/llmproxy/ --type go

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 57


🏁 Script executed:

# Check what the web says about sjson path escaping
cat > /tmp/test_sjson.go << 'EOF'
package main

import (
	"fmt"
	"github.com/tidwall/sjson"
	"github.com/tidwall/gjson"
)

func main() {
	// Test: does escaping work for sjson.DeleteBytes?
	body := []byte(`{"reasoning.effort":"high","foo":"bar"}`)
	fmt.Println("Original:", string(body))
	
	// Try to delete with escaped path
	result, _ := sjson.DeleteBytes(body, "reasoning\\.effort")
	fmt.Println("After delete:", string(result))
	
	// Verify it's gone
	if gjson.GetBytes(result, "reasoning\\.effort").Exists() {
		fmt.Println("ERROR: literal key still exists!")
	} else {
		fmt.Println("SUCCESS: literal key was removed")
	}
}
EOF
cd /tmp && go run test_sjson.go 2>&1

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 470


🏁 Script executed:

# Let's run the actual test in the repo context to see if there's a test case for literal keys
cd pkg/llmproxy/thinking/provider/iflow && go test -v -run TestApplyMiniMaxStripsReasoningVariantsAndLegacyFields 2>&1 | tail -20

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 312


🏁 Script executed:

# Check if the test input actually includes a literal "reasoning.effort" key
cat -n pkg/llmproxy/thinking/provider/iflow/apply_test.go | sed -n '54,62p'

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 407


🏁 Script executed:

# Let's create a focused test in the actual repo to verify the behavior
cat > /tmp/test_literal_key.sh << 'EOF'
#!/bin/bash
cd /repo
cat > test_literal_key.go << 'GOEOF'
package main

import (
	"fmt"
	"github.com/tidwall/sjson"
	"github.com/tidwall/gjson"
)

func main() {
	// Test case: literal "reasoning.effort" key
	body := []byte(`{"reasoning.effort":"high","foo":"bar"}`)
	fmt.Println("Original:", string(body))
	
	// Current applyMiniMax cleanup (no escape)
	result1, _ := sjson.DeleteBytes(body, "reasoning.effort")
	fmt.Println("After unescaped delete:", string(result1))
	fmt.Println("Literal key still exists?", gjson.GetBytes(result1, "reasoning\\.effort").Exists())
	
	// Proposed fix with escape
	result2, _ := sjson.DeleteBytes(body, "reasoning\\.effort")
	fmt.Println("After escaped delete:", string(result2))
	fmt.Println("Literal key still exists?", gjson.GetBytes(result2, "reasoning\\.effort").Exists())
}
GOEOF
go run test_literal_key.go 2>&1
EOF
bash /tmp/test_literal_key.sh

Repository: KooshaPari/cliproxyapi-plusplus

Length of output: 375


Add cleanup for literal reasoning.effort key in applyMiniMax.

The current cleanup removes the nested reasoning object and the underscore form reasoning_effort, but misses the literal dotted key "reasoning.effort". Since extractIFlowConfig explicitly accepts this format (via the escaped path lookup "reasoning\\.effort"), the key can be present in input; because it is never removed, it leaks upstream in the transformed request.

Use the escaped path "reasoning\\.effort" with sjson.DeleteBytes to remove the literal key:

Proposed fix
 result, _ := sjson.SetBytes(body, "reasoning_split", reasoningSplit)
 result, _ = sjson.DeleteBytes(result, "reasoning_effort")
+result, _ = sjson.DeleteBytes(result, "reasoning\\.effort")
 result, _ = sjson.DeleteBytes(result, "reasoning")
 result, _ = sjson.DeleteBytes(result, "variant")
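
The distinction the fix relies on — a literal top-level key named "reasoning.effort" versus the nested reasoning.effort path — can be sketched with plain JSON. This is a minimal Python illustration, independent of the Go sjson library the repo actually uses:

```python
import json

# A request body can carry both shapes at once: a nested
# {"reasoning": {"effort": ...}} object and a literal top-level key
# whose name contains a dot.
body = '{"reasoning":{"effort":"medium"},"reasoning.effort":"high","foo":"bar"}'
m = json.loads(body)

# Deleting the nested object leaves the literal dotted key untouched,
# which is why applyMiniMax's existing deletes miss it.
m.pop("reasoning", None)
assert "reasoning.effort" in m

# The literal key must be removed by its exact name; in sjson this
# corresponds to the escaped path "reasoning\\.effort".
m.pop("reasoning.effort", None)
cleaned = json.dumps(m)
print(cleaned)  # {"foo": "bar"} — unrelated fields are preserved
```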
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/llmproxy/thinking/provider/iflow/apply.go` around lines 146 - 148, the
cleanup in applyMiniMax misses the literal dotted key "reasoning.effort". Add an
sjson.DeleteBytes call that removes the escaped path "reasoning\\.effort"
alongside the existing deletes: locate the applyMiniMax function in
pkg/llmproxy/thinking/provider/iflow/apply.go and insert
sjson.DeleteBytes(result, "reasoning\\.effort") (following the existing
error-handling pattern) so that keys accepted by extractIFlowConfig (which reads
"reasoning\\.effort") are not leaked upstream.

Comment on lines 43 to 50
paths = []string{
"chat_template_kwargs.enable_thinking",
"chat_template_kwargs.clear_thinking",
"reasoning",
"reasoning_split",
"reasoning_effort",
"variant",
}

⚠️ Potential issue | 🟠 Major

❓ Verification inconclusive

Script executed:

find . -type f -name "*.go" | grep -E "(strip|apply)" | head -20

Repository: KooshaPari/cliproxyapi-plusplus


Exit code: 0

stdout:

./pkg/llmproxy/thinking/apply.go
./pkg/llmproxy/thinking/apply_codex_variant_test.go
./pkg/llmproxy/thinking/apply_iflow_test.go
./pkg/llmproxy/thinking/apply_logging_test.go
./pkg/llmproxy/thinking/provider/antigravity/apply.go
./pkg/llmproxy/thinking/provider/antigravity/apply_test.go
./pkg/llmproxy/thinking/provider/claude/apply.go
./pkg/llmproxy/thinking/provider/claude/apply_test.go
./pkg/llmproxy/thinking/provider/codex/apply.go
./pkg/llmproxy/thinking/provider/gemini/apply.go
./pkg/llmproxy/thinking/provider/gemini/apply_test.go
./pkg/llmproxy/thinking/provider/geminicli/apply.go
./pkg/llmproxy/thinking/provider/geminicli/apply_test.go
./pkg/llmproxy/thinking/provider/iflow/apply.go
./pkg/llmproxy/thinking/provider/iflow/apply_test.go
./pkg/llmproxy/thinking/provider/kimi/apply.go
./pkg/llmproxy/thinking/provider/openai/apply.go
./pkg/llmproxy/thinking/strip.go
./pkg/llmproxy/thinking/strip_test.go

Script executed:

cd . && git ls-files | grep -E "llmproxy.*thinking" | head -20

Repository: KooshaPari/cliproxyapi-plusplus


Exit code: 0

stdout:

pkg/llmproxy/executor/thinking_providers.go
pkg/llmproxy/thinking/apply.go
pkg/llmproxy/thinking/apply_codex_variant_test.go
pkg/llmproxy/thinking/apply_iflow_test.go
pkg/llmproxy/thinking/apply_logging_test.go
pkg/llmproxy/thinking/convert.go
pkg/llmproxy/thinking/convert_test.go
pkg/llmproxy/thinking/errors.go
pkg/llmproxy/thinking/log_redaction.go
pkg/llmproxy/thinking/log_redaction_test.go
pkg/llmproxy/thinking/provider/antigravity/apply.go
pkg/llmproxy/thinking/provider/antigravity/apply_test.go
pkg/llmproxy/thinking/provider/claude/apply.go
pkg/llmproxy/thinking/provider/claude/apply_test.go
pkg/llmproxy/thinking/provider/codex/apply.go
pkg/llmproxy/thinking/provider/gemini/apply.go
pkg/llmproxy/thinking/provider/gemini/apply_test.go
pkg/llmproxy/thinking/provider/geminicli/apply.go
pkg/llmproxy/thinking/provider/geminicli/apply_test.go
pkg/llmproxy/thinking/provider/iflow/apply.go

Script executed:

cat -n pkg/llmproxy/thinking/strip.go

Repository: KooshaPari/cliproxyapi-plusplus


Exit code: 0

stdout:

     1	// Package thinking provides unified thinking configuration processing.
     2	package thinking
     3	
     4	import (
     5		"github.com/tidwall/gjson"
     6		"github.com/tidwall/sjson"
     7	)
     8	
     9	// StripThinkingConfig removes thinking configuration fields from request body.
    10	//
    11	// This function is used when a model doesn't support thinking but the request
    12	// contains thinking configuration. The configuration is silently removed to
    13	// prevent upstream API errors.
    14	//
    15	// Parameters:
    16	//   - body: Original request body JSON
    17	//   - provider: Provider name (determines which fields to strip)
    18	//
    19	// Returns:
    20	//   - Modified request body JSON with thinking configuration removed
    21	//   - Original body is returned unchanged if:
    22	//   - body is empty or invalid JSON
    23	//   - provider is unknown
    24	//   - no thinking configuration found
    25	func StripThinkingConfig(body []byte, provider string) []byte {
    26		if len(body) == 0 || !gjson.ValidBytes(body) {
    27			return body
    28		}
    29	
    30		var paths []string
    31		switch provider {
    32		case "claude":
    33			paths = []string{"thinking"}
    34		case "gemini":
    35			paths = []string{"generationConfig.thinkingConfig"}
    36		case "gemini-cli", "antigravity":
    37			paths = []string{"request.generationConfig.thinkingConfig"}
    38		case "openai":
    39			paths = []string{"reasoning_effort"}
    40		case "codex":
    41			paths = []string{"reasoning.effort"}
    42		case "iflow":
    43			paths = []string{
    44				"chat_template_kwargs.enable_thinking",
    45				"chat_template_kwargs.clear_thinking",
    46				"reasoning",
    47				"reasoning_split",
    48				"reasoning_effort",
    49				"variant",
    50			}
    51		default:
    52			return body
    53		}
    54	
    55		result := body
    56		for _, path := range paths {
    57			result, _ = sjson.DeleteBytes(result, path)
    58		}
    59		if provider == "iflow" {
    60			if kwargs := gjson.GetBytes(result, "chat_template_kwargs"); kwargs.Exists() && len(kwargs.Map()) == 0 {
    61				result, _ = sjson.DeleteBytes(result, "chat_template_kwargs")
    62			}
    63		}
    64		return result
    65	}

Script executed:

cat -n pkg/llmproxy/thinking/provider/iflow/apply.go

Repository: KooshaPari/cliproxyapi-plusplus


Exit code: 0

stdout:

     1	// Package iflow implements thinking configuration for iFlow models.
     2	//
     3	// iFlow models use boolean toggle semantics:
     4	//   - Models using chat_template_kwargs.enable_thinking (boolean toggle)
     5	//   - MiniMax models: reasoning_split (boolean)
     6	//
     7	// Level values are converted to boolean: none=false, all others=true
     8	// See: _bmad-output/planning-artifacts/architecture.md#Epic-9
     9	package iflow
    10	
    11	import (
    12		"strings"
    13	
    14		"github.com/kooshapari/cliproxyapi-plusplus/v6/pkg/llmproxy/registry"
    15		"github.com/kooshapari/cliproxyapi-plusplus/v6/pkg/llmproxy/thinking"
    16		"github.com/tidwall/gjson"
    17		"github.com/tidwall/sjson"
    18	)
    19	
    20	// Applier implements thinking.ProviderApplier for iFlow models.
    21	//
    22	// iFlow-specific behavior:
    23	//   - enable_thinking toggle models: enable_thinking boolean
    24	//   - GLM models: enable_thinking boolean + clear_thinking=false
    25	//   - MiniMax models: reasoning_split boolean
    26	//   - Level to boolean: none=false, others=true
    27	//   - No quantized support (only on/off)
    28	type Applier struct{}
    29	
    30	var _ thinking.ProviderApplier = (*Applier)(nil)
    31	
    32	// NewApplier creates a new iFlow thinking applier.
    33	func NewApplier() *Applier {
    34		return &Applier{}
    35	}
    36	
    37	func init() {
    38		thinking.RegisterProvider("iflow", NewApplier())
    39	}
    40	
    41	// Apply applies thinking configuration to iFlow request body.
    42	//
    43	// Expected output format (GLM):
    44	//
    45	//	{
    46	//	  "chat_template_kwargs": {
    47	//	    "enable_thinking": true,
    48	//	    "clear_thinking": false
    49	//	  }
    50	//	}
    51	//
    52	// Expected output format (MiniMax):
    53	//
    54	//	{
    55	//	  "reasoning_split": true
    56	//	}
    57	func (a *Applier) Apply(body []byte, config thinking.ThinkingConfig, modelInfo *registry.ModelInfo) ([]byte, error) {
    58		if thinking.IsUserDefinedModel(modelInfo) {
    59			return body, nil
    60		}
    61		if modelInfo.Thinking == nil {
    62			return body, nil
    63		}
    64	
    65		if isEnableThinkingModel(modelInfo.ID) {
    66			return applyEnableThinking(body, config, isGLMModel(modelInfo.ID)), nil
    67		}
    68	
    69		if isMiniMaxModel(modelInfo.ID) {
    70			return applyMiniMax(body, config), nil
    71		}
    72	
    73		return body, nil
    74	}
    75	
    76	// configToBoolean converts ThinkingConfig to boolean for iFlow models.
    77	//
    78	// Conversion rules:
    79	//   - ModeNone: false
    80	//   - ModeAuto: true
    81	//   - ModeBudget + Budget=0: false
    82	//   - ModeBudget + Budget>0: true
    83	//   - ModeLevel + Level="none": false
    84	//   - ModeLevel + any other level: true
    85	//   - Default (unknown mode): true
    86	func configToBoolean(config thinking.ThinkingConfig) bool {
    87		switch config.Mode {
    88		case thinking.ModeNone:
    89			return false
    90		case thinking.ModeAuto:
    91			return true
    92		case thinking.ModeBudget:
    93			return config.Budget > 0
    94		case thinking.ModeLevel:
    95			return config.Level != thinking.LevelNone
    96		default:
    97			return true
    98		}
    99	}
   100	
   101	// applyEnableThinking applies thinking configuration for models that use
   102	// chat_template_kwargs.enable_thinking format.
   103	//
   104	// Output format when enabled:
   105	//
   106	//	{"chat_template_kwargs": {"enable_thinking": true, "clear_thinking": false}}
   107	//
   108	// Output format when disabled:
   109	//
   110	//	{"chat_template_kwargs": {"enable_thinking": false}}
   111	//
   112	// Note: clear_thinking is only set for GLM models when thinking is enabled.
   113	func applyEnableThinking(body []byte, config thinking.ThinkingConfig, setClearThinking bool) []byte {
   114		enableThinking := configToBoolean(config)
   115	
   116		if len(body) == 0 || !gjson.ValidBytes(body) {
   117			body = []byte(`{}`)
   118		}
   119	
   120		result, _ := sjson.SetBytes(body, "chat_template_kwargs.enable_thinking", enableThinking)
   121	
   122		// clear_thinking is a GLM-only knob, strip it for other models.
   123		result, _ = sjson.DeleteBytes(result, "chat_template_kwargs.clear_thinking")
   124	
   125		// clear_thinking only needed when thinking is enabled
   126		if enableThinking && setClearThinking {
   127			result, _ = sjson.SetBytes(result, "chat_template_kwargs.clear_thinking", false)
   128		}
   129	
   130		return result
   131	}
   132	
   133	// applyMiniMax applies thinking configuration for MiniMax models.
   134	//
   135	// Output format:
   136	//
   137	//	{"reasoning_split": true/false}
   138	func applyMiniMax(body []byte, config thinking.ThinkingConfig) []byte {
   139		reasoningSplit := configToBoolean(config)
   140	
   141		if len(body) == 0 || !gjson.ValidBytes(body) {
   142			body = []byte(`{}`)
   143		}
   144	
   145		result, _ := sjson.SetBytes(body, "reasoning_split", reasoningSplit)
   146		result, _ = sjson.DeleteBytes(result, "reasoning_effort")
   147		result, _ = sjson.DeleteBytes(result, "reasoning")
   148		result, _ = sjson.DeleteBytes(result, "variant")
   149	
   150		return result
   151	}
   152	
   153	// isEnableThinkingModel determines if the model uses chat_template_kwargs.enable_thinking format.
   154	func isEnableThinkingModel(modelID string) bool {
   155		if isGLMModel(modelID) {
   156			return true
   157		}
   158		id := strings.ToLower(modelID)
   159		switch id {
   160		case "qwen3-max-preview", "deepseek-v3.2", "deepseek-v3.1":
   161			return true
   162		default:
   163			return false
   164		}
   165	}
   166	
   167	// isGLMModel determines if the model is a GLM series model.
   168	func isGLMModel(modelID string) bool {
   169		return strings.HasPrefix(strings.ToLower(modelID), "glm")
   170	}
   171	
   172	// isMiniMaxModel determines if the model is a MiniMax series model.
   173	// MiniMax models use reasoning_split format.
   174	func isMiniMaxModel(modelID string) bool {
   175		return strings.HasPrefix(strings.ToLower(modelID), "minimax")
   176	}

Script executed:

cat -n pkg/llmproxy/thinking/provider/iflow/apply_test.go

Repository: KooshaPari/cliproxyapi-plusplus


Exit code: 0

stdout:

     1	package iflow
     2	
     3	import (
     4		"testing"
     5	
     6		"github.com/kooshapari/cliproxyapi-plusplus/v6/pkg/llmproxy/registry"
     7		"github.com/kooshapari/cliproxyapi-plusplus/v6/pkg/llmproxy/thinking"
     8		"github.com/tidwall/gjson"
     9	)
    10	
    11	func TestApplyMiniMaxStripsReasoningEffort(t *testing.T) {
    12		a := NewApplier()
    13		body := []byte(`{"reasoning_effort":"high","foo":"bar"}`)
    14		cfg := thinking.ThinkingConfig{Mode: thinking.ModeLevel, Level: thinking.LevelHigh}
    15		model := &registry.ModelInfo{
    16			ID:       "minimax-m2.5",
    17			Thinking: &registry.ThinkingSupport{Levels: []string{"low", "medium", "high"}},
    18		}
    19	
    20		out, err := a.Apply(body, cfg, model)
    21		if err != nil {
    22			t.Fatalf("unexpected error: %v", err)
    23		}
    24	
    25		res := gjson.ParseBytes(out)
    26		if !res.Get("reasoning_split").Bool() {
    27			t.Fatalf("expected reasoning_split=true for non-none config")
    28		}
    29		if res.Get("reasoning_effort").Exists() {
    30			t.Fatalf("expected reasoning_effort to be removed")
    31		}
    32	}
    33	
    34	func TestApplyMiniMaxSetsReasoningSplitFalseForModeNone(t *testing.T) {
    35		a := NewApplier()
    36		body := []byte(`{"reasoning_effort":"high","foo":"bar"}`)
    37		cfg := thinking.ThinkingConfig{Mode: thinking.ModeNone}
    38		model := &registry.ModelInfo{
    39			ID:       "minimax-m2",
    40			Thinking: &registry.ThinkingSupport{Levels: []string{"none", "low", "high"}},
    41		}
    42	
    43		out, err := a.Apply(body, cfg, model)
    44		if err != nil {
    45			t.Fatalf("unexpected error: %v", err)
    46		}
    47	
    48		res := gjson.ParseBytes(out)
    49		if res.Get("reasoning_split").Bool() {
    50			t.Fatalf("expected reasoning_split=false for ModeNone")
    51		}
    52	}
    53	
    54	func TestApplyMiniMaxStripsReasoningVariantsAndLegacyFields(t *testing.T) {
    55		a := NewApplier()
    56		body := []byte(`{
    57			"reasoning_split":true,
    58			"reasoning_effort":"high",
    59			"reasoning":{"effort":"medium","summary":{"text":"legacy"}},
    60			"variant":"low",
    61			"foo":"bar"
    62		}`)
    63		cfg := thinking.ThinkingConfig{Mode: thinking.ModeLevel, Level: thinking.LevelLow}
    64		model := &registry.ModelInfo{
    65			ID:       "minimax-m2.5",
    66			Thinking: &registry.ThinkingSupport{Levels: []string{"low", "medium", "high"}},
    67		}
    68	
    69		out, err := a.Apply(body, cfg, model)
    70		if err != nil {
    71			t.Fatalf("unexpected error: %v", err)
    72		}
    73	
    74		res := gjson.ParseBytes(out)
    75		if !res.Get("reasoning_split").Bool() {
    76			t.Fatalf("expected reasoning_split=true")
    77		}
    78		if res.Get("reasoning_effort").Exists() {
    79			t.Fatalf("expected reasoning_effort to be removed")
    80		}
    81		if res.Get("reasoning").Exists() {
    82			t.Fatalf("expected reasoning object to be removed")
    83		}
    84		if res.Get("reasoning.effort").Exists() {
    85			t.Fatalf("expected reasoning.effort to be removed")
    86		}
    87		if res.Get("variant").Exists() {
    88			t.Fatalf("expected variant to be removed")
    89		}
    90		if res.Get("foo").String() != "bar" {
    91			t.Fatalf("expected unrelated fields to be preserved")
    92		}
    93	}

Script executed:

rg "reasoning\.effort" --type go -A 2 -B 2

Repository: KooshaPari/cliproxyapi-plusplus


Exit code: 0

stdout:

test/thinking_conversion_test.go-			model:       "level-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model","messages":[{"role":"user","content":"hi"}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "medium",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model(medium)",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model(medium)","messages":[{"role":"user","content":"hi"}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "medium",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model(none)",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model(none)","messages":[{"role":"user","content":"hi"}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "minimal",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model(auto)",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model(auto)","messages":[{"role":"user","content":"hi"}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "medium",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model","contents":[{"role":"user","parts":[{"text":"hi"}]}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "medium",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model(8192)",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model(8192)","contents":[{"role":"user","parts":[{"text":"hi"}]}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "medium",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model(64000)",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model(64000)","contents":[{"role":"user","parts":[{"text":"hi"}]}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "high",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model(0)",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model(0)","contents":[{"role":"user","parts":[{"text":"hi"}]}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "minimal",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model(-1)",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model(-1)","contents":[{"role":"user","parts":[{"text":"hi"}]}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "medium",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "user-defined-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"user-defined-model","messages":[{"role":"user","content":"hi"}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "medium",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "user-defined-model(8192)",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"user-defined-model(8192)","messages":[{"role":"user","content":"hi"}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "medium",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "user-defined-model(64000)",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"user-defined-model(64000)","messages":[{"role":"user","content":"hi"}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "xhigh",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "user-defined-model(0)",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"user-defined-model(0)","messages":[{"role":"user","content":"hi"}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "none",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "user-defined-model(-1)",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"user-defined-model(-1)","messages":[{"role":"user","content":"hi"}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "auto",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			expectErr:   true,
test/thinking_conversion_test.go-		},
test/thinking_conversion_test.go:		// Case 82: OpenAI-Response to Codex, level high → passthrough reasoning.effort
test/thinking_conversion_test.go-		{
test/thinking_conversion_test.go-			name:        "82",
--
test/thinking_conversion_test.go-			model:       "level-model(high)",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model(high)","input":[{"role":"user","content":"hi"}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "high",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "gpt-5-codex(high)",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"gpt-5-codex(high)","input":[{"role":"user","content":"hi"}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "high",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "gpt-5.2-codex(xhigh)",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"gpt-5.2-codex(xhigh)","input":[{"role":"user","content":"hi"}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "xhigh",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "gpt-5.2-codex(none)",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"gpt-5.2-codex(none)","input":[{"role":"user","content":"hi"}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "none",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "gpt-5-codex(none)",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"gpt-5-codex(none)","input":[{"role":"user","content":"hi"}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "low",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model","messages":[{"role":"user","content":"hi"}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "medium",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model","messages":[{"role":"user","content":"hi"}],"reasoning_effort":"medium"}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "medium",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model","messages":[{"role":"user","content":"hi"}],"reasoning_effort":"none"}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "minimal",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model","messages":[{"role":"user","content":"hi"}],"reasoning_effort":"auto"}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "medium",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model","contents":[{"role":"user","parts":[{"text":"hi"}]}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "medium",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model","contents":[{"role":"user","parts":[{"text":"hi"}]}],"generationConfig":{"thinkingConfig":{"thinkingBudget":8192}}}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "medium",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model","contents":[{"role":"user","parts":[{"text":"hi"}]}],"generationConfig":{"thinkingConfig":{"thinkingBudget":64000}}}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "high",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model","contents":[{"role":"user","parts":[{"text":"hi"}]}],"generationConfig":{"thinkingConfig":{"thinkingBudget":0}}}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "minimal",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model","contents":[{"role":"user","parts":[{"text":"hi"}]}],"generationConfig":{"thinkingConfig":{"thinkingBudget":-1}}}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "medium",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "user-defined-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"user-defined-model","messages":[{"role":"user","content":"hi"}]}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "medium",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "user-defined-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"user-defined-model","messages":[{"role":"user","content":"hi"}],"thinking":{"type":"enabled","budget_tokens":8192}}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "medium",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "user-defined-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"user-defined-model","messages":[{"role":"user","content":"hi"}],"thinking":{"type":"enabled","budget_tokens":64000}}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "xhigh",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "user-defined-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"user-defined-model","messages":[{"role":"user","content":"hi"}],"thinking":{"type":"enabled","budget_tokens":0}}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "none",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "user-defined-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"user-defined-model","messages":[{"role":"user","content":"hi"}],"thinking":{"type":"enabled","budget_tokens":-1}}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "auto",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			expectErr:   false,
test/thinking_conversion_test.go-		},
test/thinking_conversion_test.go:		// Case 78: OpenAI-Response reasoning.effort=medium to Gemini → 8192
test/thinking_conversion_test.go-		{
test/thinking_conversion_test.go-			name:            "78",
--
test/thinking_conversion_test.go-			expectErr:       false,
test/thinking_conversion_test.go-		},
test/thinking_conversion_test.go:		// Case 79: OpenAI-Response reasoning.effort=medium to Claude → 8192
test/thinking_conversion_test.go-		{
test/thinking_conversion_test.go-			name:        "79",
--
test/thinking_conversion_test.go-			expectErr:   true,
test/thinking_conversion_test.go-		},
test/thinking_conversion_test.go:		// Case 82: OpenAI-Response to Codex, reasoning.effort=high → passthrough
test/thinking_conversion_test.go-		{
test/thinking_conversion_test.go-			name:        "82",
--
test/thinking_conversion_test.go-			model:       "level-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model","input":[{"role":"user","content":"hi"}],"reasoning":{"effort":"high"}}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "high",
test/thinking_conversion_test.go-			expectErr:   false,
test/thinking_conversion_test.go-		},
test/thinking_conversion_test.go:		// Case 83: OpenAI-Response to Codex, reasoning.effort=xhigh → out of range error
test/thinking_conversion_test.go-		{
test/thinking_conversion_test.go-			name:        "83",
--
test/thinking_conversion_test.go-		// GitHub Copilot tests: /responses endpoint (codex format) - with body params
test/thinking_conversion_test.go-
test/thinking_conversion_test.go:		// Case 118: OpenAI-Response to gpt-5-codex, reasoning.effort=high → high
test/thinking_conversion_test.go-		{
test/thinking_conversion_test.go-			name:        "118",
--
test/thinking_conversion_test.go-			model:       "gpt-5-codex",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"gpt-5-codex","input":[{"role":"user","content":"hi"}],"reasoning":{"effort":"high"}}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "high",
test/thinking_conversion_test.go-			expectErr:   false,
test/thinking_conversion_test.go-		},
test/thinking_conversion_test.go:		// Case 119: OpenAI-Response to gpt-5.2-codex, reasoning.effort=xhigh → xhigh
test/thinking_conversion_test.go-		{
test/thinking_conversion_test.go-			name:        "119",
--
test/thinking_conversion_test.go-			model:       "gpt-5.2-codex",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"gpt-5.2-codex","input":[{"role":"user","content":"hi"}],"reasoning":{"effort":"xhigh"}}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "xhigh",
test/thinking_conversion_test.go-			expectErr:   false,
test/thinking_conversion_test.go-		},
test/thinking_conversion_test.go:		// Case 120: OpenAI-Response to gpt-5.2-codex, reasoning.effort=none → none
test/thinking_conversion_test.go-		{
test/thinking_conversion_test.go-			name:        "120",
--
test/thinking_conversion_test.go-			model:       "gpt-5.2-codex",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"gpt-5.2-codex","input":[{"role":"user","content":"hi"}],"reasoning":{"effort":"none"}}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "none",
test/thinking_conversion_test.go-			expectErr:   false,
test/thinking_conversion_test.go-		},
test/thinking_conversion_test.go:		// Case 121: OpenAI-Response to gpt-5-codex, reasoning.effort=none → clamped to low (ZeroAllowed=false)
test/thinking_conversion_test.go-		{
test/thinking_conversion_test.go-			name:        "121",
--
test/thinking_conversion_test.go-			model:       "gpt-5-codex",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"gpt-5-codex","input":[{"role":"user","content":"hi"}],"reasoning":{"effort":"none"}}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "low",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-			model:       "level-model",
test/thinking_conversion_test.go-			inputJSON:   `{"model":"level-model","messages":[{"role":"user","content":"hi"}],"thinking":{"type":"adaptive"}}`,
test/thinking_conversion_test.go:			expectField: "reasoning.effort",
test/thinking_conversion_test.go-			expectValue: "high",
test/thinking_conversion_test.go-			expectErr:   false,
--
test/thinking_conversion_test.go-					hasThinking = gjson.GetBytes(body, "reasoning_effort").Exists()
test/thinking_conversion_test.go-				case "codex":
test/thinking_conversion_test.go:					hasThinking = gjson.GetBytes(body, "reasoning.effort").Exists() || gjson.GetBytes(body, "reasoning").Exists()
test/thinking_conversion_test.go-				case "iflow":
test/thinking_conversion_test.go-					hasThinking = gjson.GetBytes(body, "chat_template_kwargs.enable_thinking").Exists() || gjson.GetBytes(body, "reasoning_split").Exists()
--
pkg/llmproxy/translator/openai/openai/responses/openai_openai-responses_request.go-	//
pkg/llmproxy/translator/openai/openai/responses/openai_openai-responses_request.go-	// Priority:
pkg/llmproxy/translator/openai/openai/responses/openai_openai-responses_request.go:	// 1. reasoning.effort object field
pkg/llmproxy/translator/openai/openai/responses/openai_openai-responses_request.go:	// 2. flat legacy field "reasoning.effort"
pkg/llmproxy/translator/openai/openai/responses/openai_openai-responses_request.go-	// 3. variant
pkg/llmproxy/translator/openai/openai/responses/openai_openai-responses_request.go:	if reasoningEffort := root.Get("reasoning.effort"); reasoningEffort.Exists() {
pkg/llmproxy/translator/openai/openai/responses/openai_openai-responses_request.go-		effort := strings.ToLower(strings.TrimSpace(reasoningEffort.String()))
pkg/llmproxy/translator/openai/openai/responses/openai_openai-responses_request.go-		if effort != "" {
--
pkg/llmproxy/translator/gemini/openai/responses/gemini_openai-responses_request.go-	}
pkg/llmproxy/translator/gemini/openai/responses/gemini_openai-responses_request.go-
pkg/llmproxy/translator/gemini/openai/responses/gemini_openai-responses_request.go:	// Apply thinking configuration: convert OpenAI Responses API reasoning.effort to Gemini thinkingConfig.
pkg/llmproxy/translator/gemini/openai/responses/gemini_openai-responses_request.go-	// Inline translation-only mapping; capability checks happen later in ApplyThinking.
pkg/llmproxy/translator/gemini/openai/responses/gemini_openai-responses_request.go:	re := root.Get("reasoning.effort")
pkg/llmproxy/translator/gemini/openai/responses/gemini_openai-responses_request.go-	if re.Exists() {
pkg/llmproxy/translator/gemini/openai/responses/gemini_openai-responses_request.go-		effort := strings.ToLower(strings.TrimSpace(re.String()))
--
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-	}
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go:	if res.Get("reasoning.effort").String() != "medium" {
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go:		t.Errorf("expected reasoning.effort medium, got %s", res.Get("reasoning.effort").String())
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-	}
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-
--
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-	res2 := gjson.ParseBytes(got2)
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go:	if res2.Get("reasoning.effort").String() != "high" {
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go:		t.Errorf("expected reasoning.effort high, got %s", res2.Get("reasoning.effort").String())
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-	}
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-
--
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-	res := gjson.ParseBytes(got)
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go:	if gotEffort := res.Get("reasoning.effort").String(); gotEffort != "high" {
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go:		t.Fatalf("expected reasoning.effort to use variant fallback high, got %s", gotEffort)
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-	}
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-}
--
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-		"model": "gpt-4o",
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-		"messages": [{"role":"user","content":"hello"}],
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go:		"reasoning.effort": "low"
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-	}`)
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-	got := ConvertOpenAIRequestToCodex("gpt-4o", input, false)
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-	res := gjson.ParseBytes(got)
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go:	if gotEffort := res.Get("reasoning.effort").String(); gotEffort != "low" {
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go:		t.Fatalf("expected reasoning.effort to use legacy flat field low, got %s", gotEffort)
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-	}
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-}
--
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-	res := gjson.ParseBytes(got)
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go:	if gotEffort := res.Get("reasoning.effort").String(); gotEffort != "low" {
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go:		t.Fatalf("expected reasoning.effort to prefer reasoning_effort low, got %s", gotEffort)
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-	}
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request_test.go-}
--
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request_test.go-	outputStr := string(output)
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request_test.go-
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request_test.go:	if got := gjson.Get(outputStr, "reasoning.effort").String(); got != "high" {
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request_test.go:		t.Fatalf("expected reasoning.effort=high fallback, got %s", got)
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request_test.go-	}
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request_test.go-}
--
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request_test.go-	outputStr := string(output)
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request_test.go-
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request_test.go:	if got := gjson.Get(outputStr, "reasoning.effort").String(); got != "low" {
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request_test.go:		t.Fatalf("expected reasoning.effort to prefer explicit reasoning.effort low, got %s", got)
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request_test.go-	}
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request_test.go-}
--
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request.go-	// Map reasoning effort; support flat legacy field and variant fallback.
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request.go-	if v := gjson.GetBytes(rawJSON, "reasoning_effort"); v.Exists() {
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request.go:		out, _ = sjson.Set(out, "reasoning.effort", v.Value())
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request.go-	} else if v := gjson.GetBytes(rawJSON, `reasoning\.effort`); v.Exists() {
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request.go:		out, _ = sjson.Set(out, "reasoning.effort", v.Value())
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request.go-	} else if v := gjson.GetBytes(rawJSON, "variant"); v.Exists() {
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request.go-		effort := strings.ToLower(strings.TrimSpace(v.String()))
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request.go-		if effort == "" {
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request.go:			out, _ = sjson.Set(out, "reasoning.effort", "medium")
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request.go-		} else {
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request.go:			out, _ = sjson.Set(out, "reasoning.effort", effort)
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request.go-		}
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request.go-	} else {
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request.go:		out, _ = sjson.Set(out, "reasoning.effort", "medium")
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request.go-	}
pkg/llmproxy/translator/codex/openai/chat-completions/codex_openai_request.go-	out, _ = sjson.Set(out, "parallel_tool_calls", true)
--
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go-	out, _ = sjson.Set(out, "parallel_tool_calls", true)
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go-
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go:	// Convert Gemini thinkingConfig to Codex reasoning.effort.
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go-	// Note: Google official Python SDK sends snake_case fields (thinking_level/thinking_budget).
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go-	effortSet := false
--
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go-				effort := strings.ToLower(strings.TrimSpace(thinkingLevel.String()))
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go-				if effort != "" {
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go:					out, _ = sjson.Set(out, "reasoning.effort", effort)
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go-					effortSet = true
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go-				}
--
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go-				if thinkingBudget.Exists() {
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go-					if effort, ok := thinking.ConvertBudgetToLevel(int(thinkingBudget.Int())); ok {
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go:						out, _ = sjson.Set(out, "reasoning.effort", effort)
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go-						effortSet = true
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go-					}
--
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go-	if !effortSet {
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go-		// No thinking config, set default effort
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go:		out, _ = sjson.Set(out, "reasoning.effort", "medium")
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go-	}
pkg/llmproxy/translator/codex/gemini/codex_gemini_request.go-	out, _ = sjson.Set(out, "reasoning.summary", "auto")
--
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request.go-	rawJSON, _ = sjson.SetBytes(rawJSON, "stream", true)
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request.go-	rawJSON, _ = sjson.SetBytes(rawJSON, "store", false)
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request.go:	// Map variant -> reasoning.effort when reasoning.effort is not explicitly provided.
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request.go:	if !gjson.GetBytes(rawJSON, "reasoning.effort").Exists() {
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request.go-		if variant := gjson.GetBytes(rawJSON, "variant"); variant.Exists() {
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request.go-			effort := strings.ToLower(strings.TrimSpace(variant.String()))
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request.go-			if effort != "" {
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request.go:				rawJSON, _ = sjson.SetBytes(rawJSON, "reasoning.effort", effort)
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request.go-			}
pkg/llmproxy/translator/codex/openai/responses/codex_openai-responses_request.go-		}
--
pkg/llmproxy/thinking/apply_iflow_test.go-	}{
pkg/llmproxy/thinking/apply_iflow_test.go-		{
pkg/llmproxy/thinking/apply_iflow_test.go:			name: "nested reasoning.effort maps to level",
pkg/llmproxy/thinking/apply_iflow_test.go-			body: `{"reasoning":{"effort":"high"}}`,
pkg/llmproxy/thinking/apply_iflow_test.go-			want: ThinkingConfig{Mode: ModeLevel, Level: LevelHigh},
pkg/llmproxy/thinking/apply_iflow_test.go-		},
pkg/llmproxy/thinking/apply_iflow_test.go-		{
pkg/llmproxy/thinking/apply_iflow_test.go:			name: "literal reasoning.effort key maps to level",
pkg/llmproxy/thinking/apply_iflow_test.go:			body: `{"reasoning.effort":"high"}`,
pkg/llmproxy/thinking/apply_iflow_test.go-			want: ThinkingConfig{Mode: ModeLevel, Level: LevelHigh},
pkg/llmproxy/thinking/apply_iflow_test.go-		},
pkg/llmproxy/thinking/apply_iflow_test.go-		{
pkg/llmproxy/thinking/apply_iflow_test.go-			name: "none maps to disabled",
pkg/llmproxy/thinking/apply_iflow_test.go:			body: `{"reasoning.effort":"none"}`,
pkg/llmproxy/thinking/apply_iflow_test.go-			want: ThinkingConfig{Mode: ModeNone, Budget: 0},
pkg/llmproxy/thinking/apply_iflow_test.go-		},
--
pkg/llmproxy/thinking/strip.go-		paths = []string{"reasoning_effort"}
pkg/llmproxy/thinking/strip.go-	case "codex":
pkg/llmproxy/thinking/strip.go:		paths = []string{"reasoning.effort"}
pkg/llmproxy/thinking/strip.go-	case "iflow":
pkg/llmproxy/thinking/strip.go-		paths = []string{
--
pkg/llmproxy/thinking/strip_test.go-		t.Fatalf("expected reasoning to be removed")
pkg/llmproxy/thinking/strip_test.go-	}
pkg/llmproxy/thinking/strip_test.go:	if res.Get("reasoning.effort").Exists() {
pkg/llmproxy/thinking/strip_test.go:		t.Fatalf("expected reasoning.effort to be removed")
pkg/llmproxy/thinking/strip_test.go-	}
pkg/llmproxy/thinking/strip_test.go-	if res.Get("reasoning_split").Exists() {
--
pkg/llmproxy/thinking/provider/codex/apply.go-// Package codex implements thinking configuration for Codex (OpenAI Responses API) models.
pkg/llmproxy/thinking/provider/codex/apply.go-//
pkg/llmproxy/thinking/provider/codex/apply.go:// Codex models use the reasoning.effort format with discrete levels
pkg/llmproxy/thinking/provider/codex/apply.go-// (low/medium/high). This is similar to OpenAI but uses nested field
pkg/llmproxy/thinking/provider/codex/apply.go:// "reasoning.effort" instead of "reasoning_effort".
pkg/llmproxy/thinking/provider/codex/apply.go-// See: _bmad-output/planning-artifacts/architecture.md#Epic-8
pkg/llmproxy/thinking/provider/codex/apply.go-package codex
--
pkg/llmproxy/thinking/provider/codex/apply.go-//
pkg/llmproxy/thinking/provider/codex/apply.go-// Codex-specific behavior:
pkg/llmproxy/thinking/provider/codex/apply.go://   - Output format: reasoning.effort (string: low/medium/high/xhigh)
pkg/llmproxy/thinking/provider/codex/apply.go-//   - Level-only mode: no numeric budget support
pkg/llmproxy/thinking/provider/codex/apply.go-//   - Some models support ZeroAllowed (gpt-5.1, gpt-5.2)
--
pkg/llmproxy/thinking/provider/codex/apply.go-
pkg/llmproxy/thinking/provider/codex/apply.go-	if config.Mode == thinking.ModeLevel {
pkg/llmproxy/thinking/provider/codex/apply.go:		result, _ := sjson.SetBytes(body, "reasoning.effort", string(config.Level))
pkg/llmproxy/thinking/provider/codex/apply.go-		return result, nil
pkg/llmproxy/thinking/provider/codex/apply.go-	}
--
pkg/llmproxy/thinking/provider/codex/apply.go-	}
pkg/llmproxy/thinking/provider/codex/apply.go-
pkg/llmproxy/thinking/provider/codex/apply.go:	result, _ := sjson.SetBytes(body, "reasoning.effort", effort)
pkg/llmproxy/thinking/provider/codex/apply.go-	return result, nil
pkg/llmproxy/thinking/provider/codex/apply.go-}
--
pkg/llmproxy/thinking/provider/codex/apply.go-	}
pkg/llmproxy/thinking/provider/codex/apply.go-
pkg/llmproxy/thinking/provider/codex/apply.go:	result, _ := sjson.SetBytes(body, "reasoning.effort", effort)
pkg/llmproxy/thinking/provider/codex/apply.go-	return result, nil
pkg/llmproxy/thinking/provider/codex/apply.go-}
--
pkg/llmproxy/thinking/provider/iflow/apply_test.go-		t.Fatalf("expected reasoning object to be removed")
pkg/llmproxy/thinking/provider/iflow/apply_test.go-	}
pkg/llmproxy/thinking/provider/iflow/apply_test.go:	if res.Get("reasoning.effort").Exists() {
pkg/llmproxy/thinking/provider/iflow/apply_test.go:		t.Fatalf("expected reasoning.effort to be removed")
pkg/llmproxy/thinking/provider/iflow/apply_test.go-	}
pkg/llmproxy/thinking/provider/iflow/apply_test.go-	if res.Get("variant").Exists() {
--
pkg/llmproxy/translator/claude/openai/responses/claude_openai-responses_request.go-	root := gjson.ParseBytes(rawJSON)
pkg/llmproxy/translator/claude/openai/responses/claude_openai-responses_request.go-
pkg/llmproxy/translator/claude/openai/responses/claude_openai-responses_request.go:	// Convert OpenAI Responses reasoning.effort to Claude thinking config.
pkg/llmproxy/translator/claude/openai/responses/claude_openai-responses_request.go:	if v := root.Get("reasoning.effort"); v.Exists() {
pkg/llmproxy/translator/claude/openai/responses/claude_openai-responses_request.go-		effort := strings.ToLower(strings.TrimSpace(v.String()))
pkg/llmproxy/translator/claude/openai/responses/claude_openai-responses_request.go-		if effort != "" {
--
pkg/llmproxy/thinking/apply.go-//
pkg/llmproxy/thinking/apply.go-// Codex API format (OpenAI Responses API):
pkg/llmproxy/thinking/apply.go://   - reasoning.effort: "none", "low", "medium", "high"
pkg/llmproxy/thinking/apply.go-//
pkg/llmproxy/thinking/apply.go:// This is similar to OpenAI but uses nested field "reasoning.effort" instead of "reasoning_effort".
pkg/llmproxy/thinking/apply.go-func extractCodexConfig(body []byte) ThinkingConfig {
pkg/llmproxy/thinking/apply.go:	// Check reasoning.effort (Codex / OpenAI Responses API format)
pkg/llmproxy/thinking/apply.go:	if effort := gjson.GetBytes(body, "reasoning.effort"); effort.Exists() {
pkg/llmproxy/thinking/apply.go-		value := effort.String()
pkg/llmproxy/thinking/apply.go-		if value == "none" {
--
pkg/llmproxy/thinking/apply.go-
pkg/llmproxy/thinking/apply.go-	// Compatibility fallback: some clients send Claude-style `variant`
pkg/llmproxy/thinking/apply.go:	// instead of OpenAI/Codex `reasoning.effort`.
pkg/llmproxy/thinking/apply.go-	if variant := gjson.GetBytes(body, "variant"); variant.Exists() {
pkg/llmproxy/thinking/apply.go-		switch strings.ToLower(strings.TrimSpace(variant.String())) {
--
pkg/llmproxy/thinking/apply.go-	}
pkg/llmproxy/thinking/apply.go-
pkg/llmproxy/thinking/apply.go:	if effort := gjson.GetBytes(body, "reasoning.effort"); effort.Exists() {
pkg/llmproxy/thinking/apply.go-		if config, ok := parseIFlowReasoningEffort(strings.TrimSpace(effort.String())); ok {
pkg/llmproxy/thinking/apply.go-			return config
--
pkg/llmproxy/translator/codex/claude/codex_claude_request.go-	template, _ = sjson.Set(template, "parallel_tool_calls", true)
pkg/llmproxy/translator/codex/claude/codex_claude_request.go-
pkg/llmproxy/translator/codex/claude/codex_claude_request.go:	// Convert thinking.budget_tokens to reasoning.effort.
pkg/llmproxy/translator/codex/claude/codex_claude_request.go-	reasoningEffort := "medium"
pkg/llmproxy/translator/codex/claude/codex_claude_request.go-	if thinkingConfig := rootResult.Get("thinking"); thinkingConfig.Exists() && thinkingConfig.IsObject() {
--
pkg/llmproxy/translator/codex/claude/codex_claude_request.go-		}
pkg/llmproxy/translator/codex/claude/codex_claude_request.go-	}
pkg/llmproxy/translator/codex/claude/codex_claude_request.go:	template, _ = sjson.Set(template, "reasoning.effort", reasoningEffort)
pkg/llmproxy/translator/codex/claude/codex_claude_request.go-	template, _ = sjson.Set(template, "reasoning.summary", "auto")
pkg/llmproxy/translator/codex/claude/codex_claude_request.go-	template, _ = sjson.Set(template, "stream", true)
--
pkg/llmproxy/executor/codex_executor_cpb0106_test.go-		t.Fatalf("expected upstream stream=true, got %v", out.Bool())
pkg/llmproxy/executor/codex_executor_cpb0106_test.go-	}
pkg/llmproxy/executor/codex_executor_cpb0106_test.go:	if got := gjson.GetBytes(upstreamBody, "reasoning.effort").String(); got != "high" {
pkg/llmproxy/executor/codex_executor_cpb0106_test.go:		t.Fatalf("expected reasoning.effort=high, got %q", got)
pkg/llmproxy/executor/codex_executor_cpb0106_test.go-	}
pkg/llmproxy/executor/codex_executor_cpb0106_test.go-}
--
pkg/llmproxy/executor/codex_executor_cpb0106_test.go-		t.Fatalf("expected upstream stream=false in ExecuteStream path, got %v", got)
pkg/llmproxy/executor/codex_executor_cpb0106_test.go-	}
pkg/llmproxy/executor/codex_executor_cpb0106_test.go:	if got := gjson.GetBytes(upstreamBody, "reasoning.effort").String(); got != "high" {
pkg/llmproxy/executor/codex_executor_cpb0106_test.go:		t.Fatalf("expected reasoning.effort=high, got %q", got)
pkg/llmproxy/executor/codex_executor_cpb0106_test.go-	}
pkg/llmproxy/executor/codex_executor_cpb0106_test.go-}
--
pkg/llmproxy/translator/openai/openai/responses/openai_openai-responses_request_test.go-		"model":"gpt-4.1",
pkg/llmproxy/translator/openai/openai/responses/openai_openai-responses_request_test.go-		"input":[{"type":"message","role":"user","content":[{"type":"input_text","text":"ping"}]}],
pkg/llmproxy/translator/openai/openai/responses/openai_openai-responses_request_test.go:		"reasoning.effort":"low"
pkg/llmproxy/translator/openai/openai/responses/openai_openai-responses_request_test.go-	}`)
pkg/llmproxy/translator/openai/openai/responses/openai_openai-responses_request_test.go-

Script executed:

cat -n pkg/llmproxy/thinking/apply_iflow_test.go | head -50

Repository: KooshaPari/cliproxyapi-plusplus


Exit code: 0

stdout:

     1	package thinking
     2	
     3	import "testing"
     4	
     5	func TestExtractIFlowConfig_ReasoningEffort(t *testing.T) {
     6		tests := []struct {
     7			name string
     8			body string
     9			want ThinkingConfig
    10		}{
    11			{
    12				name: "nested reasoning.effort maps to level",
    13				body: `{"reasoning":{"effort":"high"}}`,
    14				want: ThinkingConfig{Mode: ModeLevel, Level: LevelHigh},
    15			},
    16			{
    17				name: "literal reasoning.effort key maps to level",
    18				body: `{"reasoning.effort":"high"}`,
    19				want: ThinkingConfig{Mode: ModeLevel, Level: LevelHigh},
    20			},
    21			{
    22				name: "none maps to disabled",
    23				body: `{"reasoning.effort":"none"}`,
    24				want: ThinkingConfig{Mode: ModeNone, Budget: 0},
    25			},
    26		}
    27	
    28		for _, tt := range tests {
    29			t.Run(tt.name, func(t *testing.T) {
    30				got := extractIFlowConfig([]byte(tt.body))
    31				if got != tt.want {
    32					t.Fatalf("got=%+v want=%+v", got, tt.want)
    33				}
    34			})
    35		}
    36	}
    37	
    38	func TestExtractIFlowConfig_ReasoningObjectEffort(t *testing.T) {
    39		tests := []struct {
    40			name string
    41			body string
    42			want ThinkingConfig
    43		}{
    44			{
    45				name: "reasoning object effort maps to level",
    46				body: `{"reasoning":{"effort":"medium"}}`,
    47				want: ThinkingConfig{Mode: ModeLevel, Level: LevelMedium},
    48			},
    49			{
    50				name: "empty effort falls back to empty config",
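
The table just printed drives `extractIFlowConfig`, which delegates effort strings to `parseIFlowReasoningEffort`. Inferring only from those test expectations — the type and constant names below are hypothetical stand-ins for the ones in package `thinking` — the mapping can be sketched as:

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical stand-ins for the types in pkg/llmproxy/thinking.
type Mode string
type Level string

const (
	ModeNone    Mode  = "none"
	ModeLevel   Mode  = "level"
	LevelLow    Level = "low"
	LevelMedium Level = "medium"
	LevelHigh   Level = "high"
)

type ThinkingConfig struct {
	Mode   Mode
	Level  Level
	Budget int
}

// parseEffort sketches parseIFlowReasoningEffort's observable behavior:
// "none" disables thinking; low/medium/high select level mode; anything
// else reports !ok so the caller can fall through to other fields.
func parseEffort(effort string) (ThinkingConfig, bool) {
	switch strings.ToLower(strings.TrimSpace(effort)) {
	case "none":
		return ThinkingConfig{Mode: ModeNone, Budget: 0}, true
	case "low":
		return ThinkingConfig{Mode: ModeLevel, Level: LevelLow}, true
	case "medium":
		return ThinkingConfig{Mode: ModeLevel, Level: LevelMedium}, true
	case "high":
		return ThinkingConfig{Mode: ModeLevel, Level: LevelHigh}, true
	}
	return ThinkingConfig{}, false
}

func main() {
	cfg, ok := parseEffort("high")
	fmt.Println(ok, cfg.Mode, cfg.Level)
}
```

Under these assumptions both the nested `{"reasoning":{"effort":"high"}}` and the literal `{"reasoning.effort":"high"}` bodies reduce to the same `high` string before this switch runs.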

Script executed:

rg "reasoning\.effort" pkg/llmproxy/thinking/ --type go -B 3 -A 3

Repository: KooshaPari/cliproxyapi-plusplus


Exit code: 0

stdout:

pkg/llmproxy/thinking/apply.go-// extractCodexConfig extracts thinking configuration from Codex format request body.
pkg/llmproxy/thinking/apply.go-//
pkg/llmproxy/thinking/apply.go-// Codex API format (OpenAI Responses API):
pkg/llmproxy/thinking/apply.go://   - reasoning.effort: "none", "low", "medium", "high"
pkg/llmproxy/thinking/apply.go-//
pkg/llmproxy/thinking/apply.go:// This is similar to OpenAI but uses nested field "reasoning.effort" instead of "reasoning_effort".
pkg/llmproxy/thinking/apply.go-func extractCodexConfig(body []byte) ThinkingConfig {
pkg/llmproxy/thinking/apply.go:	// Check reasoning.effort (Codex / OpenAI Responses API format)
pkg/llmproxy/thinking/apply.go:	if effort := gjson.GetBytes(body, "reasoning.effort"); effort.Exists() {
pkg/llmproxy/thinking/apply.go-		value := effort.String()
pkg/llmproxy/thinking/apply.go-		if value == "none" {
pkg/llmproxy/thinking/apply.go-			return ThinkingConfig{Mode: ModeNone, Budget: 0}
--
pkg/llmproxy/thinking/apply.go-	}
pkg/llmproxy/thinking/apply.go-
pkg/llmproxy/thinking/apply.go-	// Compatibility fallback: some clients send Claude-style `variant`
pkg/llmproxy/thinking/apply.go:	// instead of OpenAI/Codex `reasoning.effort`.
pkg/llmproxy/thinking/apply.go-	if variant := gjson.GetBytes(body, "variant"); variant.Exists() {
pkg/llmproxy/thinking/apply.go-		switch strings.ToLower(strings.TrimSpace(variant.String())) {
pkg/llmproxy/thinking/apply.go-		case "none":
--
pkg/llmproxy/thinking/apply.go-		return ThinkingConfig{Mode: ModeNone, Budget: 0}
pkg/llmproxy/thinking/apply.go-	}
pkg/llmproxy/thinking/apply.go-
pkg/llmproxy/thinking/apply.go:	if effort := gjson.GetBytes(body, "reasoning.effort"); effort.Exists() {
pkg/llmproxy/thinking/apply.go-		if config, ok := parseIFlowReasoningEffort(strings.TrimSpace(effort.String())); ok {
pkg/llmproxy/thinking/apply.go-			return config
pkg/llmproxy/thinking/apply.go-		}
--
pkg/llmproxy/thinking/strip_test.go-	if res.Get("reasoning").Exists() {
pkg/llmproxy/thinking/strip_test.go-		t.Fatalf("expected reasoning to be removed")
pkg/llmproxy/thinking/strip_test.go-	}
pkg/llmproxy/thinking/strip_test.go:	if res.Get("reasoning.effort").Exists() {
pkg/llmproxy/thinking/strip_test.go:		t.Fatalf("expected reasoning.effort to be removed")
pkg/llmproxy/thinking/strip_test.go-	}
pkg/llmproxy/thinking/strip_test.go-	if res.Get("reasoning_split").Exists() {
pkg/llmproxy/thinking/strip_test.go-		t.Fatalf("expected reasoning_split to be removed")
--
pkg/llmproxy/thinking/strip.go-	case "openai":
pkg/llmproxy/thinking/strip.go-		paths = []string{"reasoning_effort"}
pkg/llmproxy/thinking/strip.go-	case "codex":
pkg/llmproxy/thinking/strip.go:		paths = []string{"reasoning.effort"}
pkg/llmproxy/thinking/strip.go-	case "iflow":
pkg/llmproxy/thinking/strip.go-		paths = []string{
pkg/llmproxy/thinking/strip.go-			"chat_template_kwargs.enable_thinking",
--
pkg/llmproxy/thinking/apply_iflow_test.go-		want ThinkingConfig
pkg/llmproxy/thinking/apply_iflow_test.go-	}{
pkg/llmproxy/thinking/apply_iflow_test.go-		{
pkg/llmproxy/thinking/apply_iflow_test.go:			name: "nested reasoning.effort maps to level",
pkg/llmproxy/thinking/apply_iflow_test.go-			body: `{"reasoning":{"effort":"high"}}`,
pkg/llmproxy/thinking/apply_iflow_test.go-			want: ThinkingConfig{Mode: ModeLevel, Level: LevelHigh},
pkg/llmproxy/thinking/apply_iflow_test.go-		},
pkg/llmproxy/thinking/apply_iflow_test.go-		{
pkg/llmproxy/thinking/apply_iflow_test.go:			name: "literal reasoning.effort key maps to level",
pkg/llmproxy/thinking/apply_iflow_test.go:			body: `{"reasoning.effort":"high"}`,
pkg/llmproxy/thinking/apply_iflow_test.go-			want: ThinkingConfig{Mode: ModeLevel, Level: LevelHigh},
pkg/llmproxy/thinking/apply_iflow_test.go-		},
pkg/llmproxy/thinking/apply_iflow_test.go-		{
pkg/llmproxy/thinking/apply_iflow_test.go-			name: "none maps to disabled",
pkg/llmproxy/thinking/apply_iflow_test.go:			body: `{"reasoning.effort":"none"}`,
pkg/llmproxy/thinking/apply_iflow_test.go-			want: ThinkingConfig{Mode: ModeNone, Budget: 0},
pkg/llmproxy/thinking/apply_iflow_test.go-		},
pkg/llmproxy/thinking/apply_iflow_test.go-	}
--
pkg/llmproxy/thinking/provider/codex/apply.go-// Package codex implements thinking configuration for Codex (OpenAI Responses API) models.
pkg/llmproxy/thinking/provider/codex/apply.go-//
pkg/llmproxy/thinking/provider/codex/apply.go:// Codex models use the reasoning.effort format with discrete levels
pkg/llmproxy/thinking/provider/codex/apply.go-// (low/medium/high). This is similar to OpenAI but uses nested field
pkg/llmproxy/thinking/provider/codex/apply.go:// "reasoning.effort" instead of "reasoning_effort".
pkg/llmproxy/thinking/provider/codex/apply.go-// See: _bmad-output/planning-artifacts/architecture.md#Epic-8
pkg/llmproxy/thinking/provider/codex/apply.go-package codex
pkg/llmproxy/thinking/provider/codex/apply.go-
--
pkg/llmproxy/thinking/provider/codex/apply.go-// Applier implements thinking.ProviderApplier for Codex models.
pkg/llmproxy/thinking/provider/codex/apply.go-//
pkg/llmproxy/thinking/provider/codex/apply.go-// Codex-specific behavior:
pkg/llmproxy/thinking/provider/codex/apply.go://   - Output format: reasoning.effort (string: low/medium/high/xhigh)
pkg/llmproxy/thinking/provider/codex/apply.go-//   - Level-only mode: no numeric budget support
pkg/llmproxy/thinking/provider/codex/apply.go-//   - Some models support ZeroAllowed (gpt-5.1, gpt-5.2)
pkg/llmproxy/thinking/provider/codex/apply.go-type Applier struct{}
--
pkg/llmproxy/thinking/provider/codex/apply.go-	}
pkg/llmproxy/thinking/provider/codex/apply.go-
pkg/llmproxy/thinking/provider/codex/apply.go-	if config.Mode == thinking.ModeLevel {
pkg/llmproxy/thinking/provider/codex/apply.go:		result, _ := sjson.SetBytes(body, "reasoning.effort", string(config.Level))
pkg/llmproxy/thinking/provider/codex/apply.go-		return result, nil
pkg/llmproxy/thinking/provider/codex/apply.go-	}
pkg/llmproxy/thinking/provider/codex/apply.go-
--
pkg/llmproxy/thinking/provider/codex/apply.go-		return body, nil
pkg/llmproxy/thinking/provider/codex/apply.go-	}
pkg/llmproxy/thinking/provider/codex/apply.go-
pkg/llmproxy/thinking/provider/codex/apply.go:	result, _ := sjson.SetBytes(body, "reasoning.effort", effort)
pkg/llmproxy/thinking/provider/codex/apply.go-	return result, nil
pkg/llmproxy/thinking/provider/codex/apply.go-}
pkg/llmproxy/thinking/provider/codex/apply.go-
--
pkg/llmproxy/thinking/provider/codex/apply.go-		return body, nil
pkg/llmproxy/thinking/provider/codex/apply.go-	}
pkg/llmproxy/thinking/provider/codex/apply.go-
pkg/llmproxy/thinking/provider/codex/apply.go:	result, _ := sjson.SetBytes(body, "reasoning.effort", effort)
pkg/llmproxy/thinking/provider/codex/apply.go-	return result, nil
pkg/llmproxy/thinking/provider/codex/apply.go-}
pkg/llmproxy/thinking/provider/codex/apply.go-
--
pkg/llmproxy/thinking/provider/iflow/apply_test.go-	if res.Get("reasoning").Exists() {
pkg/llmproxy/thinking/provider/iflow/apply_test.go-		t.Fatalf("expected reasoning object to be removed")
pkg/llmproxy/thinking/provider/iflow/apply_test.go-	}
pkg/llmproxy/thinking/provider/iflow/apply_test.go:	if res.Get("reasoning.effort").Exists() {
pkg/llmproxy/thinking/provider/iflow/apply_test.go:		t.Fatalf("expected reasoning.effort to be removed")
pkg/llmproxy/thinking/provider/iflow/apply_test.go-	}
pkg/llmproxy/thinking/provider/iflow/apply_test.go-	if res.Get("variant").Exists() {
pkg/llmproxy/thinking/provider/iflow/apply_test.go-		t.Fatalf("expected variant to be removed")

Script executed:

cat > /tmp/test_sjson.go << 'EOF'
package main

import (
	"fmt"
	"github.com/tidwall/gjson"
	"github.com/tidwall/sjson"
)

func main() {
	// Test 1: nested path deletion
	input1 := []byte(`{"reasoning":{"effort":"high"},"foo":"bar"}`)
	result1, _ := sjson.DeleteBytes(input1, "reasoning.effort")
	fmt.Println("Nested delete result:", string(result1))
	fmt.Println("After nested delete, reasoning exists:", gjson.GetBytes(result1, "reasoning").Exists())
	fmt.Println("After nested delete, reasoning.effort exists:", gjson.GetBytes(result1, "reasoning.effort").Exists())

	// Test 2: literal dot-key without escaping
	input2 := []byte(`{"reasoning.effort":"high","foo":"bar"}`)
	result2, _ := sjson.DeleteBytes(input2, "reasoning.effort")
	fmt.Println("\nLiteral key (no escape) delete result:", string(result2))
	fmt.Println("After literal delete, reasoning.effort exists:", gjson.GetBytes(result2, `reasoning\.effort`).Exists())

	// Test 3: literal dot-key WITH escaping
	input3 := []byte(`{"reasoning.effort":"high","foo":"bar"}`)
	result3, _ := sjson.DeleteBytes(input3, `reasoning\.effort`)
	fmt.Println("\nLiteral key (with escape) delete result:", string(result3))
	fmt.Println("After escaped delete, reasoning.effort exists:", gjson.GetBytes(result3, `reasoning\.effort`).Exists())
}
EOF
cd /tmp && go mod init test 2>/dev/null; go get github.com/tidwall/sjson 2>/dev/null; go run test_sjson.go

Repository: KooshaPari/cliproxyapi-plusplus


Repository: KooshaPari/cliproxyapi-plusplus
Exit code: 1

stderr:

go: warning: ignoring go.mod in system temp root /tmp
test_sjson.go:5:2: no required module provides package github.com/tidwall/gjson: go.mod file not found in current directory or any parent directory; see 'go help modules'
test_sjson.go:6:2: no required module provides package github.com/tidwall/sjson: go.mod file not found in current directory or any parent directory; see 'go help modules'

Script executed:

cat -n pkg/llmproxy/thinking/strip_test.go

Repository: KooshaPari/cliproxyapi-plusplus


Repository: KooshaPari/cliproxyapi-plusplus
Exit code: 0

stdout:

     1	package thinking
     2	
     3	import (
     4		"testing"
     5	
     6		"github.com/tidwall/gjson"
     7	)
     8	
     9	func TestStripThinkingConfigIflow(t *testing.T) {
    10		body := []byte(`{
    11			"model":"minimax-m2.5",
    12			"reasoning":{"effort":"high"},
    13			"reasoning_effort":"high",
    14			"reasoning_split":true,
    15			"variant":"medium",
    16			"chat_template_kwargs":{"enable_thinking":true,"clear_thinking":false},
    17			"messages":[{"role":"user","content":"hi"}]
    18		}`)
    19	
    20		out := StripThinkingConfig(body, "iflow")
    21		res := gjson.ParseBytes(out)
    22	
    23		if res.Get("reasoning").Exists() {
    24			t.Fatalf("expected reasoning to be removed")
    25		}
    26		if res.Get("reasoning.effort").Exists() {
    27			t.Fatalf("expected reasoning.effort to be removed")
    28		}
    29		if res.Get("reasoning_split").Exists() {
    30			t.Fatalf("expected reasoning_split to be removed")
    31		}
    32		if res.Get("reasoning_effort").Exists() {
    33			t.Fatalf("expected reasoning_effort to be removed")
    34		}
    35		if res.Get("variant").Exists() {
    36			t.Fatalf("expected variant to be removed")
    37		}
    38		if res.Get("chat_template_kwargs").Exists() {
    39			t.Fatalf("expected chat_template_kwargs to be removed")
    40		}
    41		if res.Get("messages.0.content").String() != "hi" {
    42			t.Fatalf("expected unrelated messages to remain")
    43		}
    44	}
    45	
    46	func TestStripThinkingConfigIflowPreservesUnrelatedChatTemplateKwargs(t *testing.T) {
    47		body := []byte(`{
    48			"chat_template_kwargs":{
    49				"enable_thinking":true,
    50				"clear_thinking":false,
    51				"system_prompt":"keep-me"
    52			},
    53			"messages":[{"role":"user","content":"hi"}]
    54		}`)
    55	
    56		out := StripThinkingConfig(body, "iflow")
    57		res := gjson.ParseBytes(out)
    58	
    59		if res.Get("chat_template_kwargs.enable_thinking").Exists() {
    60			t.Fatalf("expected enable_thinking to be removed")
    61		}
    62		if res.Get("chat_template_kwargs.clear_thinking").Exists() {
    63			t.Fatalf("expected clear_thinking to be removed")
    64		}
    65		if res.Get("chat_template_kwargs.system_prompt").String() != "keep-me" {
    66			t.Fatalf("expected unrelated chat_template_kwargs fields to remain")
    67		}
    68		if res.Get("messages.0.content").String() != "hi" {
    69			t.Fatalf("expected unrelated messages to remain")
    70		}
    71	}

Script executed:

rg "reasoning\.effort" pkg/llmproxy/thinking/strip_test.go

Repository: KooshaPari/cliproxyapi-plusplus


Repository: KooshaPari/cliproxyapi-plusplus
Exit code: 0

stdout:

	if res.Get("reasoning.effort").Exists() {
		t.Fatalf("expected reasoning.effort to be removed")

Script executed:

python3 << 'PYEOF'
# Let's verify the sjson behavior with literal dot-keys
# by checking how iflow extraction tests work
import json

# From apply_iflow_test.go line 17-18:
test_body = '{"reasoning.effort":"high"}'
data = json.loads(test_body)
print("Parsed JSON with literal dot-key:")
print(json.dumps(data, indent=2))
print("\nKey in dict:", "reasoning.effort" in data)
print("Key names:", list(data.keys()))
PYEOF

Repository: KooshaPari/cliproxyapi-plusplus


Repository: KooshaPari/cliproxyapi-plusplus
Exit code: 0

stdout:

Parsed JSON with literal dot-key:
{
  "reasoning.effort": "high"
}

Key in dict: True
Key names: ['reasoning.effort']

Script executed:

rg -A 5 "TestStripThinkingConfigIflow\(" pkg/llmproxy/thinking/strip_test.go | head -20

Repository: KooshaPari/cliproxyapi-plusplus


Repository: KooshaPari/cliproxyapi-plusplus
Exit code: 0

stdout:

func TestStripThinkingConfigIflow(t *testing.T) {
	body := []byte(`{
		"model":"minimax-m2.5",
		"reasoning":{"effort":"high"},
		"reasoning_effort":"high",
		"reasoning_split":true,

Script executed:

# Check if any test input to strip has a literal reasoning.effort key
rg '"reasoning\.effort"' pkg/llmproxy/thinking/strip_test.go

Repository: KooshaPari/cliproxyapi-plusplus


Repository: KooshaPari/cliproxyapi-plusplus
Exit code: 0

stdout:

	if res.Get("reasoning.effort").Exists() {

Script executed:

sed -n '40,65p' pkg/llmproxy/thinking/strip.go

Repository: KooshaPari/cliproxyapi-plusplus


Repository: KooshaPari/cliproxyapi-plusplus
Exit code: 0

stdout:

	case "codex":
		paths = []string{"reasoning.effort"}
	case "iflow":
		paths = []string{
			"chat_template_kwargs.enable_thinking",
			"chat_template_kwargs.clear_thinking",
			"reasoning",
			"reasoning_split",
			"reasoning_effort",
			"variant",
		}
	default:
		return body
	}

	result := body
	for _, path := range paths {
		result, _ = sjson.DeleteBytes(result, path)
	}
	if provider == "iflow" {
		if kwargs := gjson.GetBytes(result, "chat_template_kwargs"); kwargs.Exists() && len(kwargs.Map()) == 0 {
			result, _ = sjson.DeleteBytes(result, "chat_template_kwargs")
		}
	}
	return result
}

Add escaped literal key path "reasoning\.effort" to iflow stripping.

The iflow case removes "reasoning" (nested) but not a literal top-level "reasoning.effort" key. Since iflow inputs can contain this literal dot-key (as shown in apply_iflow_test.go), it must be stripped to prevent unexpected payload shapes. Add the escaped gjson path to the paths list; in Go source it must be a raw string literal (backquoted), because `\.` is not a valid escape sequence in an interpreted string literal and will not compile.

Proposed fix
 	case "iflow":
 		paths = []string{
 			"chat_template_kwargs.enable_thinking",
 			"chat_template_kwargs.clear_thinking",
 			"reasoning",
 			"reasoning_split",
 			"reasoning_effort",
+			`reasoning\.effort`,
 			"variant",
 		}

Consider centralizing iflow cleanup paths shared with pkg/llmproxy/thinking/provider/iflow/apply.go to prevent future drift.
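The two key shapes the stripper must handle — a nested `reasoning.effort` object path and a literal top-level `"reasoning.effort"` key — can be sketched with stdlib-only Go (hypothetical helper name; the real stripper uses sjson.DeleteBytes with an escaped path):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// stripReasoningEffort removes both the nested reasoning.effort field and a
// literal top-level "reasoning.effort" key. It mirrors, in stdlib terms, what
// the sjson-based stripper must do with an escaped path.
func stripReasoningEffort(body []byte) ([]byte, error) {
	var m map[string]any
	if err := json.Unmarshal(body, &m); err != nil {
		return nil, err
	}
	delete(m, "reasoning.effort") // literal dot-key: a single map entry
	if nested, ok := m["reasoning"].(map[string]any); ok {
		delete(nested, "effort") // nested object path
		if len(nested) == 0 {
			delete(m, "reasoning")
		}
	}
	return json.Marshal(m)
}

func main() {
	out1, _ := stripReasoningEffort([]byte(`{"reasoning.effort":"high","foo":"bar"}`))
	fmt.Println(string(out1)) // {"foo":"bar"}
	out2, _ := stripReasoningEffort([]byte(`{"reasoning":{"effort":"high"},"foo":"bar"}`))
	fmt.Println(string(out2)) // {"foo":"bar"}
}
```

With gjson/sjson the same distinction lives in the path: `reasoning.effort` addresses the nested field, while the backslash-escaped form (kept intact via a raw string literal) addresses the literal key.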

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/llmproxy/thinking/strip.go` around lines 43 - 50, The iflow stripping
list in pkg/llmproxy/thinking/strip.go currently removes nested "reasoning" but
not a literal top-level key "reasoning.effort"; update the paths slice (the
variable assigned to paths in strip.go) to include the escaped literal key
"reasoning\.effort" so top-level dot-keys are stripped as well, and consider
moving this paths list into a shared constant used by the iflow cleanup code in
provider/iflow/apply.go to avoid drift.

Comment on lines +45 to +50
registry.GetGlobalRegistry().RegisterClient("endpoint-compat-iflow-direction", "iflow", []*registry.ModelInfo{{
ID: "minimax-m2.5-directionality-test",
SupportedEndpoints: []string{"/chat/completions"},
}})
t.Cleanup(func() {
registry.GetGlobalRegistry().UnregisterClient("endpoint-compat-iflow-directionality")

⚠️ Potential issue | 🟠 Major

Use the same client ID in cleanup.

Line 45 registers "endpoint-compat-iflow-direction", but Line 50 unregisters "endpoint-compat-iflow-directionality". That leaves the model in the global registry and makes later tests order-dependent.

🧹 Fix the leaked registry entry
 	t.Cleanup(func() {
-		registry.GetGlobalRegistry().UnregisterClient("endpoint-compat-iflow-directionality")
+		registry.GetGlobalRegistry().UnregisterClient("endpoint-compat-iflow-direction")
 	})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/api/handlers/openai/endpoint_compat_test.go` around lines 45 - 50, The
cleanup is unregistering a different client ID than the one registered, leaking
the registry entry; update the UnregisterClient call so it uses the exact same
client ID string used in RegisterClient
("endpoint-compat-iflow-direction")—i.e., locate the register call via
registry.GetGlobalRegistry().RegisterClient(...) and make the corresponding
registry.GetGlobalRegistry().UnregisterClient(...) use that identical ID.

Comment thread sdk/cliproxy/auth/manager_compat.go Outdated
Comment on lines +186 to +193
m.mu.Lock()
defer m.mu.Unlock()
if replaced, ok := m.executors[key]; ok {
if closer, okCloser := replaced.(ExecutionSessionCloser); okCloser {
closer.CloseExecutionSession(CloseAllExecutionSessionsID)
}
}
m.executors[key] = executor

⚠️ Potential issue | 🟠 Major

Move CloseExecutionSession outside m.mu.

CloseExecutionSession is external code. Calling it while holding the manager write lock can deadlock if the executor re-enters the manager, and it blocks every auth/executor read until the close returns.

Possible fix
 	m.mu.Lock()
-	defer m.mu.Unlock()
-	if replaced, ok := m.executors[key]; ok {
-		if closer, okCloser := replaced.(ExecutionSessionCloser); okCloser {
-			closer.CloseExecutionSession(CloseAllExecutionSessionsID)
-		}
-	}
-	m.executors[key] = executor
+	replaced := m.executors[key]
+	m.executors[key] = executor
+	m.mu.Unlock()
+	if closer, ok := replaced.(ExecutionSessionCloser); ok {
+		closer.CloseExecutionSession(CloseAllExecutionSessionsID)
+	}
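The capture-then-close pattern this fix relies on can be sketched with stdlib-only Go (hypothetical minimal types standing in for the manager and ExecutionSessionCloser):

```go
package main

import (
	"fmt"
	"sync"
)

// closer is a stand-in for ExecutionSessionCloser.
type closer interface{ Close() }

type manager struct {
	mu        sync.Mutex
	executors map[string]closer
}

// swap replaces the executor for key, deferring any Close call until after
// the lock is released, so external code can't re-enter and deadlock.
func (m *manager) swap(key string, next closer) {
	m.mu.Lock()
	prev := m.executors[key]
	m.executors[key] = next
	m.mu.Unlock()
	if prev != nil {
		prev.Close() // runs outside m.mu
	}
}

type fake struct{ name string }

func (f *fake) Close() { fmt.Println("closed", f.name) }

func main() {
	m := &manager{executors: map[string]closer{}}
	m.swap("openai", &fake{name: "a"})
	m.swap("openai", &fake{name: "b"}) // prints "closed a"
}
```

The key property: the map mutation and the read of the previous value happen under the lock, but the arbitrary external call happens after unlock, so even a Close that calls back into swap cannot deadlock.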
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/cliproxy/auth/manager_compat.go` around lines 186 - 193, The manager
currently calls CloseExecutionSession while holding m.mu which can deadlock;
modify Set/replace logic so you capture the replaced executor (m.executors[key])
while holding m.mu, then release the lock and call
CloseExecutionSession(CloseAllExecutionSessionsID) on it outside the lock;
specifically, in the function that assigns m.executors[key] (the block that
checks for replaced, ok := m.executors[key] and type-asserts to
ExecutionSessionCloser), store the replaced value in a local variable while
locked, perform m.executors[key] = executor, unlock (defer removal of defer if
needed), and only after the mutex is released, if the local variable implements
ExecutionSessionCloser call
closer.CloseExecutionSession(CloseAllExecutionSessionsID).

Comment thread sdk/cliproxy/auth/manager_compat.go Outdated
Comment on lines +428 to +434
func (m *Manager) ExecuteStream(ctx context.Context, providers []string, req cliproxyexecutor.Request, opts cliproxyexecutor.Options) (*cliproxyexecutor.StreamResult, error) {
auth, executor, err := m.selectAuthAndExecutor(ctx, providers, req.Model, opts)
if err != nil {
return nil, err
}
req.Model = m.applyAPIKeyModelAlias(auth, req.Model)
return executor.ExecuteStream(ctx, auth, req, opts)

⚠️ Potential issue | 🟠 Major

Record immediate stream startup failures.

Unlike Execute and ExecuteCount, this path returns ExecuteStream errors without calling recordExecutionResult, so quota/backoff state is never updated when stream creation fails.

Possible fix
 	req.Model = m.applyAPIKeyModelAlias(auth, req.Model)
-	return executor.ExecuteStream(ctx, auth, req, opts)
+	stream, execErr := executor.ExecuteStream(ctx, auth, req, opts)
+	if execErr != nil {
+		m.recordExecutionResult(ctx, auth, req.Model, execErr, nil)
+		return nil, execErr
+	}
+	return stream, nil
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/cliproxy/auth/manager_compat.go` around lines 428 - 434, The
ExecuteStream path currently returns executor.ExecuteStream errors without
updating quota/backoff state; modify Manager.ExecuteStream so after selecting
auth/executor (selectAuthAndExecutor) and applying applyAPIKeyModelAlias, you
call executor.ExecuteStream and if it returns an error immediately call
m.recordExecutionResult(...) with the same ctx, auth, req.Model (or the aliased
model) and an appropriate failure result to update quota/backoff state before
returning the error; keep the existing success return path unchanged so only
startup failures are recorded.

Comment thread sdk/cliproxy/auth/manager_compat.go Outdated
Comment on lines +452 to +487
var candidates []*Auth
var executor ProviderExecutor
m.mu.RLock()
for _, provider := range providers {
exec := m.executors[strings.ToLower(strings.TrimSpace(provider))]
if exec == nil {
continue
}
for _, auth := range m.auths {
if auth == nil || !strings.EqualFold(auth.Provider, provider) {
continue
}
if pinnedAuthID != "" && auth.ID != pinnedAuthID {
continue
}
candidates = append(candidates, auth.Clone())
}
if executor == nil {
executor = exec
}
}
selector := m.selector
m.mu.RUnlock()
if len(candidates) == 0 || executor == nil {
return nil, nil, &Error{Code: "auth_not_found", Message: "no auth candidates", HTTPStatus: http.StatusServiceUnavailable, Retryable: true}
}
auth, err := selector.Pick(ctx, providers[0], model, opts, candidates)
if err != nil {
return nil, nil, err
}
updateAggregatedAvailability(auth, now)
if selectedAuthCallback != nil && auth != nil {
selectedAuthCallback(auth.ID)
}
return auth, executor, nil
}

⚠️ Potential issue | 🔴 Critical

Keep auth selection and executor selection in the same provider scope.

candidates mixes auths from every requested provider, but executor is pinned to the first provider with an executor and Line 478 always passes providers[0] to the selector. With providers = ["gemini", "claude"], the selector can return a claude auth while Execute* still calls the gemini executor.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/cliproxy/auth/manager_compat.go` around lines 452 - 487, The selection
mixes auths from all providers into candidates but picks an executor only for
the first provider with an executor and always calls selector.Pick with
providers[0], causing provider mismatch; instead, for each provider iterate
building provider-scoped candidates from m.auths (using
strings.EqualFold(auth.Provider, provider) and pinnedAuthID filtering), get the
corresponding exec from m.executors, and if exec != nil and that provider has
candidates call selector.Pick(ctx, provider, model, opts, providerCandidates);
if selector.Pick returns an auth, call updateAggregatedAvailability and
selectedAuthCallback(auth.ID) and immediately return that auth and the matching
exec; only if no provider yields a pick return the auth_not_found error—this
ensures executor and auth come from the same provider (referencing candidates,
executor, m.executors, providers, selector.Pick, m.auths, pinnedAuthID,
updateAggregatedAvailability, selectedAuthCallback).
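The provider-scoped pairing the fix asks for can be sketched with hypothetical simplified types (auths and executors reduced to plain strings):

```go
package main

import "fmt"

// pickScoped pairs an auth with an executor from the same provider, trying
// providers in order. It never mixes an auth from one provider with another
// provider's executor — the invariant the review calls out.
func pickScoped(providers []string, auths map[string][]string, executors map[string]string) (string, string, bool) {
	for _, p := range providers {
		exec, ok := executors[p]
		if !ok || len(auths[p]) == 0 {
			continue // this provider can't serve both halves; try the next one
		}
		return auths[p][0], exec, true
	}
	return "", "", false
}

func main() {
	// gemini has an executor but no auths; claude has both.
	auths := map[string][]string{"claude": {"claude-auth-1"}}
	executors := map[string]string{"gemini": "gemini-exec", "claude": "claude-exec"}
	a, e, ok := pickScoped([]string{"gemini", "claude"}, auths, executors)
	fmt.Println(a, e, ok) // prints: claude-auth-1 claude-exec true
}
```

Under the buggy shape, the same inputs could yield a claude auth paired with the gemini executor, since candidates were pooled across providers while the executor was pinned to the first provider that had one.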

Comment thread sdk/cliproxy/auth/manager_compat.go Outdated
Comment on lines +543 to +574
switch strings.ToLower(strings.TrimSpace(auth.Provider)) {
case "gemini":
for _, key := range cfg.GeminiKey {
if strings.TrimSpace(key.APIKey) == apiKey && (baseURL == "" || strings.TrimSpace(key.BaseURL) == baseURL) {
models := make([]modelAliasEntry, 0, len(key.Models))
for _, model := range key.Models {
models = append(models, model)
}
return resolveModelAliasFromConfigModels(requestedModel, models)
}
}
case "claude":
for _, key := range cfg.ClaudeKey {
if strings.TrimSpace(key.APIKey) == apiKey && (baseURL == "" || strings.TrimSpace(key.BaseURL) == baseURL) {
models := make([]modelAliasEntry, 0, len(key.Models))
for _, model := range key.Models {
models = append(models, model)
}
return resolveModelAliasFromConfigModels(requestedModel, models)
}
}
case "codex":
for _, key := range cfg.CodexKey {
if strings.TrimSpace(key.APIKey) == apiKey && (baseURL == "" || strings.TrimSpace(key.BaseURL) == baseURL) {
models := make([]modelAliasEntry, 0, len(key.Models))
for _, model := range key.Models {
models = append(models, model)
}
return resolveModelAliasFromConfigModels(requestedModel, models)
}
}
}

🛠️ Refactor suggestion | 🟠 Major

Deduplicate provider-specific API-key alias lookup.

The three switch branches are the same scan with a different config slice, and this helper is already over 50 lines. Extract a provider registry/helper so new providers don't keep extending this hot path with copy/paste.

As per coding guidelines, "Build generic building blocks (provider interface + registry) before application logic" and "Maximum function length: 40 lines".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/cliproxy/auth/manager_compat.go` around lines 543 - 574, The switch on
strings.ToLower(strings.TrimSpace(auth.Provider)) duplicates the same
API-key/model lookup for cfg.GeminiKey, cfg.ClaudeKey and cfg.CodexKey; extract
a provider registry/helper and replace the three branches with a single generic
lookup. Implement a small provider registry (map[string]func(*Config)[]KeyType
or an interface) that maps "gemini"/"claude"/"codex" to functions returning the
corresponding cfg.<GeminiKey|ClaudeKey|CodexKey> slice, then in the current
function call that helper once: iterate the returned keys, compare
strings.TrimSpace(key.APIKey) and BaseURL, build models []modelAliasEntry and
call resolveModelAliasFromConfigModels(requestedModel, models). Update names
referenced here (auth.Provider, cfg.GeminiKey, cfg.ClaudeKey, cfg.CodexKey,
resolveModelAliasFromConfigModels, modelAliasEntry) and ensure the refactor
keeps the logic identical and reduces the function length below 40 lines.
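The registry shape the prompt describes can be sketched with hypothetical trimmed-down config types (the real key structs and config live in the SDK config package):

```go
package main

import (
	"fmt"
	"strings"
)

// apiKeyEntry and config are hypothetical stand-ins for the real config types.
type apiKeyEntry struct{ APIKey string }

type config struct {
	GeminiKey, ClaudeKey, CodexKey []apiKeyEntry
}

// keySources replaces the three copy/paste switch branches with a registry:
// adding a provider becomes one map entry instead of another ~10-line branch.
var keySources = map[string]func(*config) []apiKeyEntry{
	"gemini": func(c *config) []apiKeyEntry { return c.GeminiKey },
	"claude": func(c *config) []apiKeyEntry { return c.ClaudeKey },
	"codex":  func(c *config) []apiKeyEntry { return c.CodexKey },
}

// keysFor normalizes the provider name the same way the switch did.
func keysFor(provider string, cfg *config) []apiKeyEntry {
	if src, ok := keySources[strings.ToLower(strings.TrimSpace(provider))]; ok {
		return src(cfg)
	}
	return nil
}

func main() {
	cfg := &config{CodexKey: []apiKeyEntry{{APIKey: "ck-1"}}}
	fmt.Println(len(keysFor(" Codex ", cfg))) // prints 1
}
```

The shared scan (trim the API key, compare the base URL, build the modelAliasEntry slice, call resolveModelAliasFromConfigModels) then runs once over whatever slice keysFor returns.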

Comment thread sdk/cliproxy/auth/oauth_model_alias.go Outdated
KooshaPari and others added 2 commits March 14, 2026 19:07
Co-authored-by: Codex <noreply@openai.com>
Co-authored-by: Codex <noreply@openai.com>
@KooshaPari KooshaPari merged commit 35e929d into main Mar 15, 2026
27 of 29 checks passed
@KooshaPari KooshaPari deleted the codex/stability-fix-1 branch March 15, 2026 02:20

Labels

HELIOS-CODEX Bundle identifier for HELIOS-CODEX release train HELIOS-CODEX-L0 HELIOS-CODEX foundation layer

1 participant