Disable GLM dedicated Fireworks deployment#556
Conversation
Greptile SummaryThis PR disables the dedicated Fireworks deployment for GLM 5.1 and routes it through the serverless API during existing freebuff availability hours. It introduces a Confidence Score: 5/5Safe to merge; all remaining findings are P2 suggestions that do not affect current behavior. The routing change is correct and well-tested. The only concern is a latent maintenance trap: deployment-mapped models are no longer auto hours-gated, which only matters if a future entry is added to FIREWORKS_DEPLOYMENT_MAP without updating FIREWORKS_HOURS_GATED_MODELS. All other findings are style/cleanup. web/src/llm-api/fireworks.ts (hours-gate coupling) and freebuff/e2e/tests/slash-commands.e2e.test.ts (skipped suite).
|
| Filename | Overview |
|---|---|
| web/src/llm-api/fireworks.ts | Adds FIREWORKS_HOURS_GATED_MODELS set and isHoursGatedModel check; separates hours-gating from deployment-map membership, which means future re-enabled deployment models won't be auto-gated. |
| web/src/llm-api/fireworks-config.ts | GLM 5.1 deployment entry commented out, leaving an empty deployment map; change is intentional and well-commented. |
| web/src/llm-api/tests/fireworks-deployment.test.ts | Injects TEST_DEPLOYMENT_MAP into deployment-path tests and adds two new tests covering serverless-only GLM routing; coverage is complete. |
| web/src/app/api/v1/chat/completions/tests/completions.test.ts | Updates expected model ID from dedicated deployment to serverless GLM path; straightforward test update. |
| freebuff/e2e/tests/slash-commands.e2e.test.ts | Entire e2e suite temporarily skipped with describe.skip; no tracking reference included. |
Prompt To Fix All With AI
This is a comment left during a code review.
Path: web/src/llm-api/fireworks.ts
Line: 737-743
Comment:
**Hours-gate no longer covers deployment-mapped models**
Before this PR, any model in `FIREWORKS_DEPLOYMENT_MAP` was implicitly hours-gated via `hasDeployment && !isDeploymentHours(now)`. Now the gate is driven solely by `FIREWORKS_HOURS_GATED_MODELS`, so if a model is re-added to `FIREWORKS_DEPLOYMENT_MAP` (e.g. minimax) without also being added to the set, it will be reachable outside deployment hours. Deriving `isHoursGatedModel` from both keeps the two lists in sync automatically:
```typescript
const isHoursGatedModel = FIREWORKS_HOURS_GATED_MODELS.has(originalModel) || hasDeployment
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: freebuff/e2e/tests/slash-commands.e2e.test.ts
Line: 41
Comment:
**Skipped e2e suite left without a tracking ticket**
`describe.skip` silently removes this entire suite from CI with no in-code reference to a ticket or issue. If the underlying failure isn't fixed soon, this skip is easy to forget. Consider adding a comment with the tracking issue so it's easy to remove once the root cause is resolved.
How can I resolve this? If you propose a fix, please make it concise.Reviews (1): Last reviewed commit: "disable glm deployment" | Re-trigger Greptile
| const isHoursGatedModel = FIREWORKS_HOURS_GATED_MODELS.has(originalModel) | ||
| const shouldFallbackToStandardApi = body.codebuff_metadata?.cost_mode === 'lite' | ||
|
|
||
| const createStandardApiRequest = () => | ||
| createFireworksRequest({ body, originalModel, fetch, sessionId }) | ||
|
|
||
| if (hasDeployment && !isDeploymentHours(now)) { | ||
| if (isHoursGatedModel && !isDeploymentHours(now)) { |
There was a problem hiding this comment.
Hours-gate no longer covers deployment-mapped models
Before this PR, any model in FIREWORKS_DEPLOYMENT_MAP was implicitly hours-gated via hasDeployment && !isDeploymentHours(now). Now the gate is driven solely by FIREWORKS_HOURS_GATED_MODELS, so if a model is re-added to FIREWORKS_DEPLOYMENT_MAP (e.g. minimax) without also being added to the set, it will be reachable outside deployment hours. Deriving isHoursGatedModel from both keeps the two lists in sync automatically:
const isHoursGatedModel = FIREWORKS_HOURS_GATED_MODELS.has(originalModel) || hasDeploymentPrompt To Fix With AI
This is a comment left during a code review.
Path: web/src/llm-api/fireworks.ts
Line: 737-743
Comment:
**Hours-gate no longer covers deployment-mapped models**
Before this PR, any model in `FIREWORKS_DEPLOYMENT_MAP` was implicitly hours-gated via `hasDeployment && !isDeploymentHours(now)`. Now the gate is driven solely by `FIREWORKS_HOURS_GATED_MODELS`, so if a model is re-added to `FIREWORKS_DEPLOYMENT_MAP` (e.g. minimax) without also being added to the set, it will be reachable outside deployment hours. Deriving `isHoursGatedModel` from both keeps the two lists in sync automatically:
```typescript
const isHoursGatedModel = FIREWORKS_HOURS_GATED_MODELS.has(originalModel) || hasDeployment
```
How can I resolve this? If you propose a fix, please make it concise.| ] | ||
|
|
||
| describe('Freebuff: Slash Commands', () => { | ||
| describe.skip('Freebuff: Slash Commands', () => { |
There was a problem hiding this comment.
Skipped e2e suite left without a tracking ticket
describe.skip silently removes this entire suite from CI with no in-code reference to a ticket or issue. If the underlying failure isn't fixed soon, this skip is easy to forget. Consider adding a comment with the tracking issue so it's easy to remove once the root cause is resolved.
Prompt To Fix With AI
This is a comment left during a code review.
Path: freebuff/e2e/tests/slash-commands.e2e.test.ts
Line: 41
Comment:
**Skipped e2e suite left without a tracking ticket**
`describe.skip` silently removes this entire suite from CI with no in-code reference to a ticket or issue. If the underlying failure isn't fixed soon, this skip is easy to forget. Consider adding a comment with the tracking issue so it's easy to remove once the root cause is resolved.
How can I resolve this? If you propose a fix, please make it concise.Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Summary
Routes GLM 5.1 through Fireworks serverless during the existing freebuff availability hours by commenting out the dedicated deployment mapping and adding an explicit hours gate for GLM. Updates Fireworks and chat-completions tests for serverless routing while preserving deployment-specific fallback coverage through an injected test map. Temporarily skips the Freebuff slash-command e2e suite because it is currently failing.
Validation