[codex] Fallback lite GLM to standard Fireworks #543
Greptile Summary
This PR adds a lite-mode fallback so that GLM requests with

Confidence Score: 5/5
Safe to merge: logic is correct, all new paths are covered by tests, and existing behaviour is unchanged. The only finding is a P2 style suggestion to merge two duplicated fallback branches; no correctness or reliability issues were identified.

No files require special attention.

| Filename | Overview |
|---|---|
| web/src/llm-api/fireworks.ts | Adds lite-mode fallback to the standard Fireworks API across four deployment-unavailability scenarios; logic is correct but the two pre-deployment checks contain duplicated fallback branches that could be merged. |
| web/src/llm-api/__tests__/fireworks-deployment.test.ts | Adds four well-structured tests covering lite-mode fallback for outside-hours, cooldown, 5xx, and thrown-error scenarios; good coverage of the new code paths. |
Comments Outside Diff (1)
web/src/llm-api/fireworks.ts, line 733-771
**Consolidate the two pre-deployment fallback checks**
The "outside hours" and "cooling down" blocks share identical lite-mode behaviour (log + return standard API), so they can be merged into a single guard. This removes one repeated `if (shouldFallbackToStandardApi)` branch and keeps both distinct error responses for non-lite callers:
```typescript
if (hasDeployment && (!isDeploymentHours(now) || isDeploymentCoolingDown())) {
  if (shouldFallbackToStandardApi) {
    logger.info(
      { model: originalModel },
      'Falling back to Fireworks standard API (deployment unavailable)',
    )
    return createStandardApiRequest()
  }
  if (!isDeploymentHours(now)) {
    return new Response(
      JSON.stringify({
        error: {
          message: `${originalModel} is only available during ${FREEBUFF_DEPLOYMENT_HOURS_LABEL}. Use minimax/minimax-m2.7 outside those hours.`,
          code: 'DEPLOYMENT_OUTSIDE_HOURS',
          type: 'availability_error',
        },
      }),
      { status: 503, statusText: 'Service Unavailable' },
    )
  }
  return new Response(
    JSON.stringify({
      error: {
        message: `${originalModel} deployment is temporarily unavailable. Use minimax/minimax-m2.7 while it recovers.`,
        code: 'DEPLOYMENT_COOLDOWN',
        type: 'availability_error',
      },
    }),
    { status: 503, statusText: 'Service Unavailable' },
  )
}
```
Reviews (1): Last reviewed commit: "Fallback lite GLM to Fireworks API"
This updates lite-mode GLM routing to fall back to the standard Fireworks API when the dedicated deployment is outside its deployment hours, cooling down, returns a 5xx response, or throws before responding.
Previously, deployment-backed GLM requests surfaced provider availability errors instead of retrying against serverless, which broke codebuff lite mode whenever the dedicated deployment was unavailable.
For users, lite-mode GLM stays available. Freebuff waiting-room behavior is unchanged, so free-mode traffic keeps its existing hard-stop semantics.
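For reviewers who want the routing rules in one place, here is a minimal sketch of the four fallback scenarios as two pure functions. It is not the code in `fireworks.ts`: the types, the function names `decidePreflight` and `shouldRetryOnStandardApi`, and the choice to send requests without a deployment straight to the standard API are assumptions made for illustration. Only `shouldFallbackToStandardApi` and the two error codes are taken from the review comment above.

```typescript
// Hypothetical types for illustration; not taken from fireworks.ts.
interface DeploymentState {
  hasDeployment: boolean
  inDeploymentHours: boolean
  coolingDown: boolean
}

type PreflightDecision =
  | { route: 'deployment' }
  | { route: 'standard' }
  | { route: 'reject'; code: 'DEPLOYMENT_OUTSIDE_HOURS' | 'DEPLOYMENT_COOLDOWN' }

type DeploymentOutcome =
  | { kind: 'response'; status: number }
  | { kind: 'threw' }

// Decide where a request goes before the dedicated deployment is contacted.
function decidePreflight(
  state: DeploymentState,
  shouldFallbackToStandardApi: boolean,
): PreflightDecision {
  // Assumption: models without a dedicated deployment use the standard API.
  if (!state.hasDeployment) return { route: 'standard' }

  const unavailable = !state.inDeploymentHours || state.coolingDown
  if (!unavailable) return { route: 'deployment' }

  // Lite-mode callers are rerouted instead of receiving a 503.
  if (shouldFallbackToStandardApi) return { route: 'standard' }

  // Other callers keep the hard stop, with a code that names the cause.
  return {
    route: 'reject',
    code: state.inDeploymentHours
      ? 'DEPLOYMENT_COOLDOWN'
      : 'DEPLOYMENT_OUTSIDE_HOURS',
  }
}

// Decide whether a failed deployment attempt is retried on the standard API.
function shouldRetryOnStandardApi(
  outcome: DeploymentOutcome,
  shouldFallbackToStandardApi: boolean,
): boolean {
  if (!shouldFallbackToStandardApi) return false
  // Retry when the call threw before responding or returned a 5xx status.
  return (
    outcome.kind === 'threw' ||
    (outcome.status >= 500 && outcome.status <= 599)
  )
}
```

Keeping both decisions free of I/O means each scenario can be checked with a plain function call, without mocking `fetch` or the clock.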
Validation:
`ANTHROPIC_API_KEY=dummy FIREWORKS_API_KEY=dummy GRAVITY_API_KEY=dummy STRIPE_SUBSCRIPTION_100_PRICE_ID=dummy STRIPE_SUBSCRIPTION_200_PRICE_ID=dummy STRIPE_SUBSCRIPTION_500_PRICE_ID=dummy bun test web/src/llm-api/__tests__/fireworks-deployment.test.ts web/src/app/api/v1/freebuff/session/__tests__/session.test.ts`