fix: Respect model endpoint overrides by FlowerCode · Pull Request #2203 · QuantumNous/new-api

FlowerCode · 2025-11-10T07:25:21Z

Summary by CodeRabbit

New Features
- HTTP/2 cleartext (h2c) support enabled for improved connectivity and connection handling.
Refactor
- Routing and middleware structure reorganized to improve stability and flexibility of request handling.
- Session handling integrated into router setup to enable consistent session behavior across web routes.

coderabbitai · 2025-11-10T07:25:35Z

Walkthrough

Refactors router composition and middleware types to use gin.IRouter/gin.IRoutes, adds HTTP/2 cleartext (h2c) server setup in main, introduces a session middleware parameter wired into SetRouter, and moves NoRoute registration to engine level while adjusting logger middleware signature and safety checks.

Changes

Cohort / File(s)	Summary
HTTP/2 and Server Setup `main.go`	Adds h2c support (golang.org/x/net/http2, http2/h2c); wraps Gin engine in custom http.Server and calls ListenAndServe; removes direct SetUpLogger call; creates a session middleware and passes it into `router.SetRouter`
Middleware Layer `middleware/logger.go`	Changes `SetUpLogger` signature from `*gin.Engine` to `gin.IRoutes`; adds guarded type assertion when extracting request ID from context/params to avoid panics
Router Composition / Entry `router/main.go`	`SetRouter` signature changed to accept `(engine *gin.Engine, buildFS embed.FS, indexPage []byte, sessionMiddleware gin.HandlerFunc)`; creates `dataPlaneRouter` and `appRouter` groups; mounts Relay/Video on dataPlaneRouter and Api/Dashboard on appRouter; applies session middleware to appRouter; calls `SetWebRouter(engine, appRouter, ...)`; moves NoRoute to engine when applicable
API / Dashboard / Video Routers `router/api-router.go`, `router/dashboard.go`, `router/video-router.go`	Change signatures to accept `gin.IRouter` / `gin.IRouter` or `gin.IRouter` instead of `*gin.Engine` (wider interface usage); internal grouping logic unchanged
Relay Router `router/relay-router.go`	`SetRelayRouter` signature changed to accept `engine *gin.Engine`; creates a base `relayRouter` group off the engine and mounts existing relay routes there
Web Router and 404 Handling `router/web-router.go`	`SetWebRouter` signature changed to `(engine *gin.Engine, router gin.IRouter, buildFS embed.FS, indexPage []byte)`; middleware and static serving stay on `router`; `NoRoute` handler is now registered on `engine` instead of `router`

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Main
    participant Sessions as SessionsMiddleware
    participant SetRouter
    participant DataPlane as DataPlaneRouter
    participant AppRouter
    participant Relay as RelayRouter
    participant Video as VideoRouter
    participant Api as ApiRouter
    participant Dashboard as DashboardRouter
    participant Web as SetWebRouter
    participant Server as h2cServer

    Main->>Sessions: create sessions.Sessions("session", store)
    Main->>SetRouter: SetRouter(engine, fs, indexPage, sessionMiddleware)
    SetRouter->>DataPlane: Create dataPlaneRouter group
    SetRouter->>Relay: Mount Relay routes on dataPlaneRouter
    SetRouter->>Video: Mount Video routes on dataPlaneRouter
    SetRouter->>AppRouter: Create appRouter group
    SetRouter->>AppRouter: Apply sessionMiddleware + logger
    SetRouter->>Api: Mount API routes on appRouter
    SetRouter->>Dashboard: Mount Dashboard routes on appRouter
    SetRouter->>Web: SetWebRouter(engine, appRouter, fs, indexPage)
    Web->>Server: Register NoRoute on engine (404 handling)
    Main->>Server: ListenAndServe (h2c)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Areas needing attention:
- Verify all updated function signatures have matching call sites across the codebase.
- Confirm middleware ordering and that session middleware is applied only to appRouter.
- Review logger changes for correct extraction of request ID and any lost logging behavior.
- Check NoRoute relocation to engine for 404 precedence implications.

Poem

🐰
I hopped through routes both new and old,
Wrapped h2c dreams in code of gold,
Sessions tucked in middleware light,
Groups arranged just out of sight,
A joyous hop — the server's bold!

Pre-merge checks and finishing touches

❌ Failed checks (4 warnings)

Check name	Status	Explanation	Resolution
Title check	⚠️ Warning	The PR title 'fix: Respect model endpoint overrides' does not align with the actual changes, which involve HTTP/2 h2c support and router refactoring to use gin.IRouter interfaces.	Update the title to accurately reflect the main changes, such as 'refactor: Add h2c support and refactor routing to use gin.IRouter' or 'feat: Add HTTP/2 cleartext support with router refactoring'.
Linked Issues check	⚠️ Warning	The code changes do not address the linked issue #2202 requirements, which focus on model endpoint configuration and the OpenAIResponseOnlyModels list. The actual changes implement h2c support and router refactoring unrelated to endpoint type handling.	Implement the required changes to fix issue #2202: either make OpenAIResponseOnlyModels configurable or respect user-configured endpoint overrides without adding other endpoints automatically.
Out of Scope Changes check	⚠️ Warning	The PR includes extensive out-of-scope changes: HTTP/2 h2c support, session middleware introduction, logger middleware refactoring, and comprehensive router interface changes (gin.IRouter) that are unrelated to respecting model endpoint overrides.	Remove out-of-scope changes and focus solely on fixing the endpoint override issue described in #2202 by modifying endpoint_types.go and related model configuration logic.
Docstring Coverage	⚠️ Warning	Docstring coverage is 4.17% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d56e162 and 0049b98.

📒 Files selected for processing (1)

model/pricing.go (1 hunks)

🔇 Additional comments (1)

model/pricing.go (1)

206-212: Authoritative override logic is correctly implemented; verify test coverage and edge cases.

The override mechanism at lines 206–212 correctly implements authoritative endpoint replacement for models with custom configurations. Per-model key assignment ensures that models with custom endpoints do not inherit capability-based defaults, while models without custom configurations retain their defaults. This achieves the intended behavior.

However, verification is needed in two areas:

Test coverage: No tests were found for this override behavior. Confirm that tests exist or should be added to validate the override works with both defined and empty endpoint configurations.

Endpoint validation: The code accepts any JSON keys as valid endpoints without pre-validation. Ensure that invalid or unsupported endpoint names are either rejected or handled gracefully downstream at line 220.

coderabbitai

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0049b98 and 8b359c9.

📒 Files selected for processing (23)

common/api_type.go (1 hunks)
constant/api_type.go (1 hunks)
constant/channel.go (3 hunks)
main.go (3 hunks)
middleware/logger.go (1 hunks)
relay/channel/ali/adaptor.go (1 hunks)
relay/channel/replicate/adaptor.go (1 hunks)
relay/channel/replicate/constants.go (1 hunks)
relay/channel/replicate/dto.go (1 hunks)
relay/channel/task/vidu/adaptor.go (4 hunks)
relay/common/relay_utils.go (1 hunks)
relay/image_handler.go (2 hunks)
relay/relay_adaptor.go (2 hunks)
router/api-router.go (2 hunks)
router/dashboard.go (1 hunks)
router/main.go (1 hunks)
router/relay-router.go (3 hunks)
router/video-router.go (1 hunks)
router/web-router.go (1 hunks)
setting/model_setting/claude.go (1 hunks)
setting/ratio_setting/model_ratio.go (1 hunks)
web/src/constants/channel.constants.js (1 hunks)
web/src/helpers/render.jsx (2 hunks)

🧰 Additional context used

🧠 Learnings (5)

📚 Learning: 2025-11-07T17:46:11.086Z

Learnt from: Sh1n3zZ
Repo: QuantumNous/new-api PR: 2190
File: relay/channel/replicate/adaptor.go:207-209
Timestamp: 2025-11-07T17:46:11.086Z
Learning: In relay/channel/replicate/adaptor.go, the code uses Replicate's synchronous endpoint with the `Prefer: wait` header. With this header, the endpoint waits until the prediction completes and only returns when status is "succeeded" or a terminal failure state. It never returns with "starting" or "processing" statuses, so checking for only "succeeded" status is the correct implementation.

Applied to files:

relay/image_handler.go
relay/channel/replicate/dto.go
relay/relay_adaptor.go
relay/channel/replicate/adaptor.go

📚 Learning: 2025-08-21T06:31:11.073Z

Learnt from: jiajunly
Repo: QuantumNous/new-api PR: 1629
File: relay/channel/openai/relay-openai.go:170-174
Timestamp: 2025-08-21T06:31:11.073Z
Learning: In relay/channel/openai/relay-openai.go, the streaming logic for the AddThinkFirst feature is designed so that only the first chunk of a stream gets the "<think>\n" prefix. The final flush in the streaming handler intentionally uses addThink=false because the last chunk should never receive the prefix, even in single-chunk streams where the prefix would have been applied during normal processing.

Applied to files:

relay/channel/ali/adaptor.go

📚 Learning: 2025-06-21T03:37:41.726Z

Learnt from: 9Ninety
Repo: QuantumNous/new-api PR: 1273
File: relay/channel/gemini/relay-gemini.go:97-116
Timestamp: 2025-06-21T03:37:41.726Z
Learning: In relay/channel/gemini/relay-gemini.go, the thinking budget calculation logic (including the MaxOutputTokens multiplication) was introduced in PR #1247. PR #1273 focused specifically on decoupling the thoughts summary feature from thinking budget settings and did not modify the existing thinking budget behavior.

Applied to files:

relay/channel/ali/adaptor.go

📚 Learning: 2025-08-08T17:12:43.157Z

Learnt from: RedwindA
Repo: QuantumNous/new-api PR: 1537
File: relay/gemini_handler.go:330-342
Timestamp: 2025-08-08T17:12:43.157Z
Learning: In the new-api repository, the `GeminiEmbeddingHandler` function in `relay/gemini_handler.go` is designed specifically for native Gemini embedding requests and therefore does not require the `ConvertGeminiRequest` step that is used in the chat handler. The embedding requests are already in the native Gemini format and don't need conversion.

Applied to files:

relay/channel/replicate/adaptor.go

📚 Learning: 2025-08-05T17:14:17.246Z

Learnt from: neotf
Repo: QuantumNous/new-api PR: 1511
File: setting/ratio_setting/model_ratio.go:118-123
Timestamp: 2025-08-05T17:14:17.246Z
Learning: Claude models handle "-thinking" variants differently from Gemini models. For Claude models, only the base model (without "-thinking") gets an entry in defaultModelRatio map. The "-thinking" variants rely on the Claude relay handler stripping the suffix using strings.TrimSuffix(textRequest.Model, "-thinking") before looking up the ratio, so they automatically use the base model's ratio.

Applied to files:

setting/ratio_setting/model_ratio.go

🧬 Code graph analysis (12)

relay/image_handler.go (2)

constant/api_type.go (1)

APITypeReplicate (37-37)

service/error.go (2)

RelayErrorHandler (84-113)

ResetStatusCode (115-132)

router/api-router.go (5)

middleware/auth.go (1)

RootAuth (169-173)

middleware/rate-limit.go (1)

CriticalRateLimit (104-109)

middleware/disable-cache.go (1)

DisableCache (5-12)

middleware/secure_verification.go (1)

SecureVerificationRequired (21-80)

controller/channel.go (1)

GetChannelKey (391-422)

middleware/logger.go (1)

common/constants.go (1)

RequestIdKey (130-130)

relay/relay_adaptor.go (3)

constant/api_type.go (1)

APITypeReplicate (37-37)

relay/channel/replicate/adaptor.go (1)

Adaptor (27-28)

relay/channel/adapter.go (1)

Adaptor (15-32)

router/relay-router.go (4)

middleware/stats.go (1)

StatsMiddleware (17-29)

middleware/auth.go (2)

TokenAuth (179-290)

UserAuth (157-161)

controller/model.go (1)

ListModels (109-214)

middleware/distributor.go (1)

Distribute (29-125)

router/web-router.go (1)

middleware/rate-limit.go (1)

GlobalWebRateLimit (90-95)

router/main.go (8)

router/relay-router.go (1)

SetRelayRouter (13-183)

router/video-router.go (1)

SetVideoRouter (10-41)

middleware/logger.go (1)

SetUpLogger (10-28)

router/api-router.go (1)

SetApiRouter (11-259)

router/dashboard.go (1)

SetDashboardRouter (10-22)

common/constants.go (1)

IsMasterNode (112-112)

common/sys_log.go (1)

SysLog (11-14)

router/web-router.go (1)

SetWebRouter (16-29)

common/api_type.go (2)

constant/channel.go (1)

ChannelTypeReplicate (56-56)

constant/api_type.go (1)

APITypeReplicate (37-37)

main.go (1)

router/main.go (1)

SetRouter (16-44)

relay/channel/task/vidu/adaptor.go (6)

relay/common/relay_utils.go (3)

ValidateBasicTaskRequest (204-228)

GetTaskRequest (62-72)

HasImage (21-23)

constant/task.go (4)

TaskActionGenerate (14-14)

TaskActionTextGenerate (15-15)

TaskActionFirstTailGenerate (16-16)

TaskActionReferenceGenerate (17-17)

service/error.go (1)

TaskErrorWrapper (140-157)

constant/channel.go (1)

ChannelTypeVidu (52-52)

relay/channel/adapter.go (1)

TaskAdaptor (34-53)

relay/common/relay_info.go (1)

RelayInfo (76-123)

relay/common/relay_utils.go (1)

relay/common/relay_info.go (1)

TaskSubmitReq (488-499)

relay/channel/replicate/adaptor.go (12)

relay/common/relay_info.go (1)

RelayInfo (76-123)

constant/channel.go (2)

ChannelBaseURLs (61-119)

ChannelTypeReplicate (56-56)

relay/common/relay_utils.go (1)

GetFullRequestURL (25-37)

relay/channel/api_request.go (2)

SetupApiRequestHeader (27-39)

DoApiRequest (61-90)

relay/channel/replicate/constants.go (3)

ModelFlux11Pro (7-7)

ModelList (10-12)

ChannelName (5-5)

common/json.go (2)

Unmarshal (9-11)

Marshal (21-23)

types/error.go (5)

NewAPIError (87-95)

NewError (204-224)

ErrorCodeBadResponse (71-71)

ErrorCodeReadResponseBodyFailed (69-69)

ErrorCodeBadResponseBody (72-72)

relay/channel/replicate/dto.go (2)

PredictionResponse (3-7)

FileUploadResponse (15-19)

dto/openai_image.go (1)

ImageData (171-175)

dto/openai_response.go (1)

Usage (222-240)

service/image.go (1)

GetImageFromUrl (68-116)

service/http_client.go (1)

GetHttpClient (49-51)

🔇 Additional comments (20)

setting/model_setting/claude.go (1)

53-65: LGTM! Safe header appending prevents duplicates.

The refactored logic correctly prevents duplicate header values by checking existing entries before appending. This is more robust than the previous approach.

router/video-router.go (1)

10-10: LGTM! Interface abstraction improves flexibility.

Changing from *gin.Engine to gin.IRouter is a good refactoring that enables more flexible router composition and easier testing.

router/api-router.go (2)

11-11: LGTM! Interface abstraction improves flexibility.

Changing from *gin.Engine to gin.IRouter is a good refactoring consistent with other router setup functions in this PR.

138-138: LGTM! RootAuth requirement strengthens security.

Adding RootAuth() middleware to the channel key endpoint appropriately restricts this sensitive operation to root users only, enhancing security posture.

setting/ratio_setting/model_ratio.go (1)

270-297: LGTM! New model pricing entry is consistent.

The addition of "black-forest-labs/flux-1.1-pro" with price 0.04 is consistent with similar image generation models and properly supports the new Replicate channel integration.

middleware/logger.go (2)

10-10: LGTM! Interface abstraction improves flexibility.

Changing from *gin.Engine to gin.IRoutes is consistent with the broader router refactoring in this PR.

14-16: LGTM! Safe type assertion prevents panics.

The addition of the safe type assertion pattern prevents potential runtime panics if the request ID key is missing or has an unexpected type.

constant/channel.go (3)

56-56: LGTM! Replicate channel type addition follows existing pattern.

The new ChannelTypeReplicate = 56 constant is correctly defined with corresponding entries in ChannelBaseURLs and ChannelTypeNames.

118-118: LGTM! Base URL correctly positioned.

The Replicate API base URL is correctly placed at index 56 matching the channel type constant.

174-174: LGTM! Type name mapping complete.

The channel type name mapping for Replicate completes the channel integration.

constant/api_type.go (1)

37-38: No hardcoded values found—verification passed.

The search confirms no code uses hardcoded numeric values for APITypeDummy. The only usage found in controller/model.go:31 correctly references the constant: for i := 0; i < constant.APITypeDummy; i++. The iota shift is safe.

web/src/helpers/render.jsx (1)

58-58: LGTM! Replicate icon integration looks good.

The addition of the Replicate icon import and its case in getChannelIcon is consistent with existing channel icon patterns.

Also applies to: 346-347

common/api_type.go (1)

74-75: LGTM! Channel type mapping is correct.

The mapping of ChannelTypeReplicate to APITypeReplicate follows the established pattern for other channel types.

web/src/constants/channel.constants.js (1)

182-186: LGTM! Channel option definition is consistent.

The Replicate channel option with value 56 follows the established structure and is properly positioned in the CHANNEL_OPTIONS array.

relay/relay_adaptor.go (1)

29-29: LGTM! Replicate adaptor integration is correct.

The import and case addition for the Replicate adaptor follows the established pattern for other channel adaptors.

Also applies to: 117-118

router/dashboard.go (1)

10-10: LGTM! Signature refactor improves flexibility.

Changing the parameter from *gin.Engine to gin.IRouter makes the function more flexible while maintaining backward compatibility since *gin.Engine implements the gin.IRouter interface. This allows the dashboard router to be mounted in different routing contexts.

relay/common/relay_utils.go (2)

62-72: LGTM! GetTaskRequest helper is well implemented.

The new GetTaskRequest function provides proper error handling and type assertion for retrieving task requests from the Gin context.

204-227: VidU's automatic action adjustments remain intact and functional.

The review comment contains an incorrect assumption. The automatic action adjustments for VidU were never removed from ValidateBasicTaskRequest because they were never located there to begin with. VidU's action adjustment logic—based on image count and metadata—is implemented in ValidateRequestAndSetAction (relay/channel/task/vidu/adaptor.go:85-108) and remains unchanged. The function:

Calls ValidateBasicTaskRequest with a default action

Retrieves the parsed request

Adjusts info.Action based on metadata and image count (lines 93-107)

This multi-step flow ensures proper action handling for VidU's three modes: text-to-video, image-to-video (single), first-tail frame, and reference image generation. No changes to callers are required.

Likely an incorrect or invalid review comment.

relay/image_handler.go (1)

96-104: No action needed—201 status handling is correct.

The special handling for Replicate's 201 Created status is safe. While the HTTP status code is normalized from 201 to 200, the actual error validation still occurs downstream in adaptor.DoResponse(). The Replicate adaptor validates both the error messages and prediction status (lines 207-218 of adaptor.go) regardless of the HTTP status code conversion, so legitimate errors will not be masked. Based on Replicate's API behavior with the Prefer: wait header—which only returns succeeded or terminal failure states—the 201 response normalization is appropriate.

router/relay-router.go (1)

13-17: Middleware has moved from engine-level to group-level scope—verify this doesn't break existing functionality.

The refactoring changed how middleware is applied. Previously, middleware.CORS(), middleware.DecompressRequestMiddleware(), and middleware.StatsMiddleware() appear to have been applied at the engine level (affecting all routes). Now they're applied only to the relayRouter group, which means:

Relay routes (/v1/*, /mj/*, /suno/*, etc.): receive CORS, DecompressRequest, Stats middleware ✓

Video routes (/v1/videos/:task_id/content via dataPlaneRouter): do NOT receive these middleware

API/dashboard routes (/api/*, /dashboard, etc. via appRouter): do NOT receive CORS, DecompressRequest, Stats

Verify that:

Video and API/dashboard routes don't require the relay middleware (especially CORS and Stats)

No existing clients depend on CORS headers from non-relay routes

Load stats/metrics collection still captures data from all routes

coderabbitai · 2025-11-13T11:13:45Z

+	//if strings.Contains(request.Model, "thinking") {
+	//	request.EnableThinking = true
+	//	request.Stream = true
+	//	info.IsStream = true
+	//}
+	//// fix: ali parameter.enable_thinking must be set to false for non-streaming calls
+	//if !info.IsStream {
+	//	request.EnableThinking = false
+	//}


🛠️ Refactor suggestion | 🟠 Major

🧩 Analysis chain

Consider removing commented code or documenting the reason for disabling.

The commented-out logic previously enforced thinking-related parameters for Ali models. If this logic is no longer needed (perhaps to respect user configuration as per the PR objective), consider deleting it rather than leaving it commented. If it's temporarily disabled for testing, add a TODO comment explaining when/why it should be re-enabled.

Verify that Ali thinking models still function correctly without this parameter enforcement:

🏁 Script executed:

#!/bin/bash # Search for other references to Ali thinking model handling or enable_thinking parameter rg -n -C3 'enable_thinking|EnableThinking' --type=go

Length of output: 1530

🏁 Script executed:

#!/bin/bash # Search for active (non-commented) assignments to EnableThinking rg -n 'request\.EnableThinking\s*=' --type=go | grep -v '^\s*//' # Also search for where EnableThinking might be read or used rg -n 'EnableThinking' --type=go -A1 -B1 | head -60

Length of output: 758

🏁 Script executed:

#!/bin/bash # Search for any TODO/FIXME related to thinking or enable_thinking rg -n -i 'todo|fixme' relay/channel/ali/ --type=go

Length of output: 161

🏁 Script executed:

#!/bin/bash # Check git history or comments around this file for context head -100 relay/channel/ali/adaptor.go

Length of output: 3359

Remove the commented-out Ali thinking parameter enforcement code.

The verification confirms this commented code was the only place where EnableThinking was ever actively modified. Since it's no longer needed and no active code sets this parameter anywhere in the codebase, it should be deleted to keep the code clean. Ali thinking models will continue to work—users can now control enable_thinking directly through their request parameters.

🤖 Prompt for AI Agents

In relay/channel/ali/adaptor.go around lines 85 to 93, remove the entire commented-out block that attempted to enforce/modify the Ali "thinking" parameters (the lines beginning with //if strings.Contains... through the final commented block that set EnableThinking to false). Delete these commented lines so the file no longer contains dead code; after removal, ensure there are no leftover blank lines or references to the removed block and run gofmt/go vet to keep formatting and imports clean.

coderabbitai · 2025-11-13T11:13:45Z

+func (a *Adaptor) SetupRequestHeader(c *gin.Context, req *http.Header, info *relaycommon.RelayInfo) error {
+	if info == nil {
+		return errors.New("replicate adaptor: relay info is nil")
+	}
+	if info.ApiKey == "" {
+		return errors.New("replicate adaptor: api key is required")
+	}
+	channel.SetupApiRequestHeader(info, c, req)
+	req.Set("Authorization", "Bearer "+info.ApiKey)
+	req.Set("Prefer", "wait")
+	if req.Get("Content-Type") == "" {
+		req.Set("Content-Type", "application/json")
+	}
+	if req.Get("Accept") == "" {
+		req.Set("Accept", "application/json")
+	}
+	return nil


⚠️ Potential issue | 🔴 Critical

Force JSON Content-Type when talking to Replicate.

Image-edit calls arrive as multipart, so channel.SetupApiRequestHeader copies Content-Type: multipart/form-data onto the downstream request, but we send JSON here. Replicate responds with 400/415 because the header and body disagree. Please override the header to the JSON value we actually send.

- channel.SetupApiRequestHeader(info, c, req) - req.Set("Authorization", "Bearer "+info.ApiKey) - req.Set("Prefer", "wait") - if req.Get("Content-Type") == "" { - req.Set("Content-Type", "application/json") - } + channel.SetupApiRequestHeader(info, c, req) + req.Set("Authorization", "Bearer "+info.ApiKey) + req.Set("Prefer", "wait") + req.Set("Content-Type", "application/json")

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

func (a *Adaptor) SetupRequestHeader(c *gin.Context, req *http.Header, info *relaycommon.RelayInfo) error {

if info == nil {

return errors.New("replicate adaptor: relay info is nil")

}

if info.ApiKey == "" {

return errors.New("replicate adaptor: api key is required")

}

channel.SetupApiRequestHeader(info, c, req)

req.Set("Authorization", "Bearer "+info.ApiKey)

req.Set("Prefer", "wait")

if req.Get("Content-Type") == "" {

req.Set("Content-Type", "application/json")

}

if req.Get("Accept") == "" {

req.Set("Accept", "application/json")

}

return nil

func (a *Adaptor) SetupRequestHeader(c *gin.Context, req *http.Header, info *relaycommon.RelayInfo) error {

if info == nil {

return errors.New("replicate adaptor: relay info is nil")

}

if info.ApiKey == "" {

return errors.New("replicate adaptor: api key is required")

}

channel.SetupApiRequestHeader(info, c, req)

req.Set("Authorization", "Bearer "+info.ApiKey)

req.Set("Prefer", "wait")

req.Set("Content-Type", "application/json")

if req.Get("Accept") == "" {

req.Set("Accept", "application/json")

}

return nil

}

🤖 Prompt for AI Agents

In relay/channel/replicate/adaptor.go around lines 47 to 63, the current SetupRequestHeader copies the original request's Content-Type (often multipart/form-data) which causes Replicate to reject the request because we actually send JSON; change the logic to always set the "Content-Type" header to "application/json" (override any existing value) before returning, ensure "Accept" is set to "application/json" if missing, and preserve the existing Authorization and Prefer header behavior.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

router/main.go (1)

17-34: Document middleware architecture and verify CORS redundancy.

The router refactoring introduces a three-tier middleware architecture:

Relay routes (line 17): Own middleware applied internally (CORS, Decompress, Stats)

Video routes (line 25): dataPlaneRouter middleware (CORS, Decompress, Stats)

API/Dashboard routes (lines 33-34): appRouter middleware (session, logger) plus their own internal middleware

This creates middleware duplication, particularly CORS, which is applied:

In SetRelayRouter internally (relay-router.go:13)

On dataPlaneRouter for video routes (line 21)

In SetDashboardRouter internally (dashboard.go:14)

In SetApiRouter internally (per api-router.go summary)

Consider:

Adding inline comments documenting the data-plane vs app-plane architectural intent

Verifying whether multiple CORS middleware layers are necessary or cause conflicts

Standardizing middleware application (either at the group level OR internally, not both)

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d05cc7f and 78465ea.

📒 Files selected for processing (1)

router/main.go (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

router/main.go (9)

router/relay-router.go (1)

SetRelayRouter (13-183)

middleware/stats.go (1)

StatsMiddleware (17-29)

router/video-router.go (1)

SetVideoRouter (10-41)

middleware/logger.go (1)

SetUpLogger (10-28)

router/api-router.go (1)

SetApiRouter (11-259)

router/dashboard.go (1)

SetDashboardRouter (10-22)

common/constants.go (1)

IsMasterNode (112-112)

common/sys_log.go (1)

SysLog (11-14)

router/web-router.go (1)

SetWebRouter (16-29)

🔇 Additional comments (3)

router/main.go (3)

31-31: Verify intentional omission of logger from relay and video routes.

The logger middleware is applied only to appRouter, which means relay routes (core API endpoints like /v1/chat/completions, /v1/models) and video routes are not logged. This could hinder debugging and monitoring of the data plane.

Is this intentional architectural separation, or should logging be applied globally?

16-16: All callers properly handle the new sessionMiddleware parameter.

Verification confirms that main.go has been updated to provide the sessionMiddleware parameter when calling SetRouter (line 159), with the middleware properly initialized at line 153 using sessions.Sessions("session", store). No issues found.

1-49: Unable to definitively verify: git context unavailable in sandbox environment.

The review comment raises a valid structural concern—the provided code snippet shows only router middleware refactoring without endpoint override logic. However, without access to:

Git diffs showing what actually changed in this PR

PR metadata (description, commit messages)

Complete file context

I cannot confirm whether:

The router changes are necessary infrastructure for the endpoint fix

They are legitimately bundled with the fix or represent scope creep

The actual endpoint override implementation exists in other changed files

To verify and resolve this review comment, manually confirm:

Does the PR include changes to common/endpoint_type.go or controller/model.go where the endpoint override logic should reside?

If so, how do the router middleware changes relate to or enable those fixes?

Should the router refactoring be split into a separate PR?

coderabbitai Bot reviewed Nov 10, 2025

View reviewed changes

Comment thread model/pricing.go Outdated

FlowerCode added 2 commits November 13, 2025 19:11

refactor: slim relay routing middleware

2a56c61

feat: enable h2c server

d05cc7f

coderabbitai Bot reviewed Nov 13, 2025

View reviewed changes

FlowerCode force-pushed the main branch from 8b359c9 to d05cc7f Compare November 13, 2025 11:13

fix: restore data-plane middleware for video proxy routes

78465ea

coderabbitai Bot reviewed Nov 13, 2025

View reviewed changes

FlowerCode closed this Nov 13, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: Respect model endpoint overrides#2203

fix: Respect model endpoint overrides#2203
FlowerCode wants to merge 3 commits into
QuantumNous:mainfrom
FlowerCode:main

FlowerCode commented Nov 10, 2025 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Nov 10, 2025 •

edited

Loading

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Nov 13, 2025

Uh oh!

coderabbitai Bot Nov 13, 2025

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

FlowerCode commented Nov 10, 2025 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Nov 10, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

Pre-merge checks and finishing touches

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Nov 13, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Nov 13, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

FlowerCode commented Nov 10, 2025 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Nov 10, 2025 •

edited

Loading