[codex] Fallback lite GLM to standard Fireworks by brandonkachen · Pull Request #543 · CodebuffAI/codebuff

brandonkachen · 2026-04-25T00:32:31Z

This updates lite-mode GLM routing to fall back to the standard Fireworks API when the dedicated deployment is outside hours, cooling down, returns 5xxs, or throws before responding.

Previously, deployment-backed GLM requests surfaced provider availability errors instead of retrying against the serverless API, which broke codebuff lite mode whenever the dedicated deployment was unavailable.

For users, lite-mode GLM stays available. Freebuff waiting-room behavior is unchanged and keeps the existing hard-stop semantics for free-mode traffic.

Validation: `ANTHROPIC_API_KEY=dummy FIREWORKS_API_KEY=dummy GRAVITY_API_KEY=dummy STRIPE_SUBSCRIPTION_100_PRICE_ID=dummy STRIPE_SUBSCRIPTION_200_PRICE_ID=dummy STRIPE_SUBSCRIPTION_500_PRICE_ID=dummy bun test web/src/llm-api/__tests__/fireworks-deployment.test.ts web/src/app/api/v1/freebuff/session/__tests__/session.test.ts`

greptile-apps · 2026-04-25T00:34:53Z

Greptile Summary

This PR adds a lite-mode fallback so that GLM requests with `cost_mode: 'lite'` transparently retry against the standard Fireworks serverless API when the dedicated deployment is outside hours, cooling down, returns a 5xx, or throws before responding. Existing freebuff/non-lite behaviour is fully preserved.
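
The four scenarios above split into two decisions: one made before the deployment is called (outside hours, cooling down) and one made after it resolves or throws (5xx, thrown error). A minimal sketch of that split follows. Every name in it (`Route`, `DeploymentState`, `preflightRoute`, `shouldRetryOnStandardApi`) is hypothetical and chosen for illustration; it is not the code in `web/src/llm-api/fireworks.ts`.

```typescript
// Hypothetical sketch; names are illustrative, not taken from fireworks.ts.
type Route = 'deployment' | 'standard-api' | 'availability-error'

interface DeploymentState {
  withinHours: boolean
  coolingDown: boolean
}

type DeploymentOutcome =
  | { kind: 'response'; status: number }
  | { kind: 'threw' }

// Only lite-mode requests are allowed to fall back; every other
// cost mode keeps the existing hard-stop behaviour.
function canFallBack(costMode: string): boolean {
  return costMode === 'lite'
}

// Decision made before any request is sent to the dedicated deployment.
function preflightRoute(costMode: string, state: DeploymentState): Route {
  if (state.withinHours && !state.coolingDown) return 'deployment'
  return canFallBack(costMode) ? 'standard-api' : 'availability-error'
}

// Decision made after the deployment request has resolved or thrown.
function shouldRetryOnStandardApi(
  costMode: string,
  outcome: DeploymentOutcome,
): boolean {
  if (!canFallBack(costMode)) return false
  return outcome.kind === 'threw' || outcome.status >= 500
}
```

Keeping both decisions as pure functions is a design choice of this sketch only; it makes each of the four scenarios checkable without stubbing network calls.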

Confidence Score: 5/5

Safe to merge — logic is correct, all new paths are covered by tests, and existing behaviour is unchanged.

The only finding is a P2 style suggestion to merge two duplicated fallback branches; no correctness or reliability issues were identified.

No files require special attention.

Important Files Changed

Filename	Overview
web/src/llm-api/fireworks.ts	Adds lite-mode fallback to the standard Fireworks API across four deployment-unavailability scenarios; logic is correct but the two pre-deployment checks contain duplicated fallback branches that could be merged.
web/src/llm-api/__tests__/fireworks-deployment.test.ts	Adds four well-structured tests covering lite-mode fallback for outside-hours, cooldown, 5xx, and thrown-error scenarios; good coverage of the new code paths.

Comments Outside Diff (1)

web/src/llm-api/fireworks.ts, lines 733-771

Consolidate the two pre-deployment fallback checks

The "outside hours" and "cooling down" blocks share identical lite-mode behaviour (log + return standard API), so they can be merged into a single guard. This removes one repeated `if (shouldFallbackToStandardApi)` branch and keeps both distinct error responses for non-lite callers:

```typescript
if (hasDeployment && (!isDeploymentHours(now) || isDeploymentCoolingDown())) {
  if (shouldFallbackToStandardApi) {
    logger.info(
      { model: originalModel },
      'Falling back to Fireworks standard API (deployment unavailable)',
    )
    return createStandardApiRequest()
  }
  if (!isDeploymentHours(now)) {
    return new Response(
      JSON.stringify({
        error: {
          message: `${originalModel} is only available during ${FREEBUFF_DEPLOYMENT_HOURS_LABEL}. Use minimax/minimax-m2.7 outside those hours.`,
          code: 'DEPLOYMENT_OUTSIDE_HOURS',
          type: 'availability_error',
        },
      }),
      { status: 503, statusText: 'Service Unavailable' },
    )
  }
  return new Response(
    JSON.stringify({
      error: {
        message: `${originalModel} deployment is temporarily unavailable. Use minimax/minimax-m2.7 while it recovers.`,
        code: 'DEPLOYMENT_COOLDOWN',
        type: 'availability_error',
      },
    }),
    { status: 503, statusText: 'Service Unavailable' },
  )
}
```
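
The two non-lite responses in the suggestion differ only in `code` and `message`. As a possible follow-up, a small builder could capture the shared shape. The helper below is hypothetical: the name `availabilityError` and its return type are assumptions for illustration and are not part of this PR. It returns a plain object rather than a `Response` so the sketch stays environment-independent.

```typescript
// Hypothetical helper; not part of the PR or of fireworks.ts.
type AvailabilityCode = 'DEPLOYMENT_OUTSIDE_HOURS' | 'DEPLOYMENT_COOLDOWN'

interface AvailabilityError {
  status: 503
  statusText: 'Service Unavailable'
  body: string
}

// Builds the shared 503 payload; only the code and message vary.
function availabilityError(
  code: AvailabilityCode,
  message: string,
): AvailabilityError {
  return {
    status: 503,
    statusText: 'Service Unavailable',
    body: JSON.stringify({
      error: { message, code, type: 'availability_error' },
    }),
  }
}
```

A caller could pass `body` and the status fields to `new Response(...)` at the point where the suggestion currently constructs each response inline.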

Reviews (1): Last reviewed commit: "Fallback lite GLM to Fireworks API"

Fallback lite GLM to Fireworks API

e569b7e

jahooma marked this pull request as ready for review April 25, 2026 00:32

jahooma requested review from charleslien and jahooma as code owners April 25, 2026 00:33

jahooma merged commit fc9a76d into main Apr 25, 2026
19 checks passed

jahooma deleted the jahooma/lite-fw-fallback branch April 25, 2026 00:34