Disable GLM dedicated Fireworks deployment #556
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -2,6 +2,7 @@ import { Agent } from 'undici' | |
|
|
||
| import { | ||
| FREEBUFF_DEPLOYMENT_HOURS_LABEL, | ||
| + FREEBUFF_GLM_MODEL_ID, | ||
| isFreebuffDeploymentHours, | ||
| } from '@codebuff/common/constants/freebuff-models' | ||
| import { PROFIT_MARGIN } from '@codebuff/common/constants/limits' | ||
|
|
@@ -38,6 +39,11 @@ const FIREWORKS_MODEL_MAP: Record<string, string> = { | |
| 'z-ai/glm-5.1': 'accounts/fireworks/models/glm-5p1', | ||
| } | ||
|
|
||
| + /** Models that stay limited to freebuff deployment hours even on serverless. */ | ||
| + const FIREWORKS_HOURS_GATED_MODELS = new Set<string>([ | ||
| + FREEBUFF_GLM_MODEL_ID, | ||
| + ]) | ||
|
|
||
| /** Flag to enable custom Fireworks deployments (set to false to use global API only) */ | ||
| const FIREWORKS_USE_CUSTOM_DEPLOYMENT = true | ||
|
|
||
|
|
@@ -706,9 +712,10 @@ async function parseFireworksError(response: Response): Promise<FireworksError> | |
| } | ||
|
|
||
| /** | ||
| - * Uses custom Fireworks deployments only during deployment hours. Deployment | ||
| - * mapped models never fall back to the serverless API outside hours, during | ||
| - * cooldown, or after deployment 5xxs; those states surface as provider errors | ||
| + * Uses custom Fireworks deployments only during deployment hours. Some models | ||
| + * are still availability-gated even when served by the Fireworks serverless | ||
| + * API. Deployment-mapped models never fall back to the serverless API during | ||
| + * cooldown or after deployment 5xxs; those states surface as provider errors | ||
| * so freebuff can offer MiniMax as the always-on option. | ||
| */ | ||
| export async function createFireworksRequestWithFallback(params: { | ||
|
|
@@ -717,20 +724,23 @@ export async function createFireworksRequestWithFallback(params: { | |
| fetch: typeof globalThis.fetch | ||
| logger: Logger | ||
| useCustomDeployment?: boolean | ||
| + deploymentMap?: Record<string, string> | ||
| sessionId: string | ||
| now?: Date | ||
| }): Promise<Response> { | ||
| const { body, originalModel, fetch, logger, sessionId } = params | ||
| const now = params.now ?? new Date() | ||
| const useCustomDeployment = params.useCustomDeployment ?? FIREWORKS_USE_CUSTOM_DEPLOYMENT | ||
| - const deploymentModelId = FIREWORKS_DEPLOYMENT_MAP[originalModel] | ||
| + const deploymentMap = params.deploymentMap ?? FIREWORKS_DEPLOYMENT_MAP | ||
| + const deploymentModelId = deploymentMap[originalModel] | ||
| const hasDeployment = useCustomDeployment && Boolean(deploymentModelId) | ||
| + const isHoursGatedModel = FIREWORKS_HOURS_GATED_MODELS.has(originalModel) | ||
| const shouldFallbackToStandardApi = body.codebuff_metadata?.cost_mode === 'lite' | ||
|
|
||
| const createStandardApiRequest = () => | ||
| createFireworksRequest({ body, originalModel, fetch, sessionId }) | ||
|
|
||
| - if (hasDeployment && !isDeploymentHours(now)) { | ||
| + if (isHoursGatedModel && !isDeploymentHours(now)) { | ||
|
Contributor comment on lines +737 to +743 of web/src/llm-api/fireworks.ts:
**Hours-gate no longer covers deployment-mapped models**
Before this PR, any model in `FIREWORKS_DEPLOYMENT_MAP` was implicitly hours-gated via `hasDeployment && !isDeploymentHours(now)`. Now the gate is driven solely by `FIREWORKS_HOURS_GATED_MODELS`, so if a model is re-added to `FIREWORKS_DEPLOYMENT_MAP` (e.g. minimax) without also being added to the set, it will be reachable outside deployment hours. Deriving `isHoursGatedModel` from both keeps the two lists in sync automatically:
```typescript
const isHoursGatedModel = FIREWORKS_HOURS_GATED_MODELS.has(originalModel) || hasDeployment
```
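To make the difference concrete, here is a minimal sketch comparing the gate as written in this PR with the gate suggested above. The model IDs, the map entry, and the `inDeploymentHours` flag (standing in for `isDeploymentHours(now)`) are hypothetical placeholders, not values from the codebase.

```typescript
// Illustrative stand-ins only; the real constants are not fully shown in this diff.
const HOURS_GATED_MODELS = new Set<string>(['example/glm']) // hypothetical model ID
const DEPLOYMENT_MAP: Record<string, string> = {
  // Hypothetical entry representing a model re-added to the deployment map.
  'example/minimax': 'accounts/example/deployments/abc',
}

type GateInput = {
  model: string
  useCustomDeployment: boolean
  inDeploymentHours: boolean // stands in for isDeploymentHours(now)
}

/** Gate as written in the PR: only the explicit set is consulted. */
function isBlockedSetOnly({ model, inDeploymentHours }: GateInput): boolean {
  return HOURS_GATED_MODELS.has(model) && !inDeploymentHours
}

/** Gate as suggested in the review: set membership OR an active deployment mapping. */
function isBlockedSetOrDeployment(input: GateInput): boolean {
  const hasDeployment =
    input.useCustomDeployment && Boolean(DEPLOYMENT_MAP[input.model])
  const isHoursGated = HOURS_GATED_MODELS.has(input.model) || hasDeployment
  return isHoursGated && !input.inDeploymentHours
}

// A deployment-mapped model that was not added to the set, requested off-hours.
const offHours: GateInput = {
  model: 'example/minimax',
  useCustomDeployment: true,
  inDeploymentHours: false,
}

console.log(isBlockedSetOnly(offHours)) // false: the request passes the gate
console.log(isBlockedSetOrDeployment(offHours)) // true: the request is gated
```

With the set-only gate, the mapped model slips through outside deployment hours; deriving the gate from both sources blocks it without requiring a second list to be updated.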
|
||
| if (shouldFallbackToStandardApi) { | ||
| logger.info( | ||
| { model: originalModel }, | ||
|
|
||
`describe.skip` silently removes this entire suite from CI with no in-code reference to a ticket or issue. If the underlying failure isn't fixed soon, this skip is easy to forget. Consider adding a comment with the tracking issue so it's easy to remove once the root cause is resolved.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!