
feat: add deployment hours availability for freebuff GLM 5.1 model #540

Merged
jahooma merged 3 commits into main from us-hours on Apr 24, 2026

jahooma commented on Apr 24, 2026

Summary

Add a deployment-hours availability policy for freebuff models. GLM 5.1 is now available only from 9am ET to 5pm PT on weekdays, while MiniMax M2.7 remains always available.

Availability system:

  • New availability field on FreebuffModelOption: 'always' or 'deployment_hours'
  • New isFreebuffDeploymentHours() helper: checks Mon-Fri, 9am ET to 5pm PT
  • New isFreebuffModelAvailable() and resolveAvailableFreebuffModel() utilities
  • FREEBUFF_DEPLOYMENT_HOURS_LABEL constant: '9am ET-5pm PT'
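
For reviewers who want to reason about the window, here is a minimal sketch of how a Mon-Fri, 9am ET to 5pm PT check could be written. It is not the code from this PR: the IANA zone names (`America/New_York`, `America/Los_Angeles`) and the names `zonedClock` and `isDeploymentHoursSketch` are assumptions made for illustration.

```typescript
// Sketch only: the zone names and helper names here are assumptions,
// not copied from common/src/constants/freebuff-models.ts.
const ET_ZONE = 'America/New_York'
const PT_ZONE = 'America/Los_Angeles'

// Weekday name and 0-23 hour of `date` as observed in `timeZone`.
function zonedClock(date: Date, timeZone: string): { weekday: string; hour: number } {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone,
    weekday: 'short',
    hour: 'numeric',
    hourCycle: 'h23', // 0-23, so midnight is 0 and there is no AM/PM to parse
  }).formatToParts(date)
  const valueOf = (type: string): string =>
    parts.find((part) => part.type === type)?.value ?? ''
  return { weekday: valueOf('weekday'), hour: Number(valueOf('hour')) }
}

// Open when it is a weekday, at or after 9am Eastern, and before 5pm Pacific.
// Inside that window the calendar day is the same in both zones, so the
// Eastern weekday is enough.
function isDeploymentHoursSketch(now: Date = new Date()): boolean {
  const et = zonedClock(now, ET_ZONE)
  const pt = zonedClock(now, PT_ZONE)
  const isWeekday = et.weekday !== 'Sat' && et.weekday !== 'Sun'
  return isWeekday && et.hour >= 9 && pt.hour < 17
}
```

With this sketch, `isDeploymentHoursSketch(new Date('2026-04-17T12:00:00Z'))` (a Friday, 8am ET) is `false`, and the same day at `16:00:00Z` is `true`. Reading each edge in its own zone lets the runtime handle daylight-saving shifts instead of hard-coding UTC offsets.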

CLI model selector:

  • GLM 5.1 shown first with "Smartest" tagline, MiniMax as "Fast"
  • Real-time availability checks via useNow hook (updates every minute)
  • Unavailable models shown as "Closed", non-interactable
  • Auto-resets to MiniMax if selected model becomes unavailable
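
The auto-reset behavior can be pictured with a small sketch like the one below. The type and function names (`ModelOptionSketch`, `isModelAvailableSketch`, `resolveAvailableModelSketch`) are hypothetical stand-ins, and treating an unknown model ID as "fall back to the default" is an assumption, not something this PR states.

```typescript
// Hypothetical shapes for illustration; the real FreebuffModelOption may differ.
type Availability = 'always' | 'deployment_hours'

interface ModelOptionSketch {
  id: string
  availability: Availability
}

const MODELS: ModelOptionSketch[] = [
  { id: 'z-ai/glm-5.1', availability: 'deployment_hours' },
  { id: 'minimax/minimax-m2.7', availability: 'always' },
]

// MiniMax is the always-available default in this sketch.
const DEFAULT_MODEL_ID = 'minimax/minimax-m2.7'

function isModelAvailableSketch(model: ModelOptionSketch, inDeploymentHours: boolean): boolean {
  return model.availability === 'always' || inDeploymentHours
}

// Keep the selection if it is usable right now, otherwise fall back to the default.
// Assumption: an ID that is not in MODELS also resolves to the default.
function resolveAvailableModelSketch(selectedId: string, inDeploymentHours: boolean): string {
  const selected = MODELS.find((model) => model.id === selectedId)
  if (selected && isModelAvailableSketch(selected, inDeploymentHours)) {
    return selected.id
  }
  return DEFAULT_MODEL_ID
}
```

A selector component could call something like this whenever its minute-level clock ticks, so a GLM selection made at 4:59pm PT resolves to MiniMax a minute later.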

Server-side:

  • GLM 5.1 uses dedicated Fireworks deployment (mjb4i7ea) during deployment hours
  • Session queue gates on deployment hours availability
  • Outside deployment hours: GLM requests return 503 DEPLOYMENT_OUTSIDE_HOURS
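
As a rough illustration of the server-side gate, the sketch below returns a structured 503 instead of routing the request elsewhere. The function name `gateDeploymentRequest`, the result shape, and the message text are assumptions; only the 503 status and the `DEPLOYMENT_OUTSIDE_HOURS` code come from this PR description.

```typescript
// Hypothetical result shape; the real handler may build an HTTP response instead.
type GateResult =
  | { ok: true }
  | { ok: false; status: 503; code: 'DEPLOYMENT_OUTSIDE_HOURS'; message: string }

// Decide whether a request for a deployment-backed model may proceed.
// `inDeploymentHours` is passed in so the caller controls the clock.
function gateDeploymentRequest(inDeploymentHours: boolean): GateResult {
  if (!inDeploymentHours) {
    return {
      ok: false,
      status: 503,
      code: 'DEPLOYMENT_OUTSIDE_HOURS',
      // Illustrative wording only.
      message: 'This model is only available during deployment hours.',
    }
  }
  return { ok: true }
}
```

Returning a machine-readable code rather than a bare status gives the client something stable to branch on when it decides to switch models.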

Fireworks deployment:

  • GLM 5.1 deployment map entry added
  • Pricing: .40/M input, $0.26/M cached input, .40/M output

Test plan

  • All affected test suites pass (completions, fireworks-deployment, freebuff-session, public-api)
  • Deployment hours logic tested with mocked times
  • No stale references to removed model IDs

jahooma added 3 commits April 23, 2026 21:01
- Switch base2-free, editor-lite, code-reviewer-lite agents from kimi-k2.6 to z-ai/glm-5.1
- Update FREEBUFF_KIMI_MODEL_ID → FREEBUFF_GLM_MODEL_ID constant
- Update Fireworks deployment map (mjb4i7ea), model map, and pricing
- Remove moonshotai/kimi-k2.6 and kimi-k2.6:nitro from ModelName type
- Update freebuff model selector to show GLM first with 'Smartest' tagline
- Update all test files with new model IDs and deployment IDs
- Update docs and scripts to reference GLM instead of Kimi
jahooma changed the title from "feat: replace Kimi K2.6 with GLM 5.1 as freebuff deployment-hours model" to "feat: add deployment hours availability for freebuff GLM 5.1 model" on Apr 24, 2026
greptile-apps Bot commented Apr 24, 2026

Greptile Summary

This PR replaces Kimi K2.6 with GLM 5.1 as the freebuff deployment-hours model and introduces a time-bounded availability system: GLM 5.1 is only selectable 9am ET–5pm PT on weekdays (backed by a dedicated Fireworks deployment), while MiniMax M2.7 remains always-on via serverless. The key architectural change is the removal of the serverless fallback for deployment-model failures — errors now surface explicitly as 503s so the client can redirect users to MiniMax instead of silently degrading quality.

The implementation is thorough: isFreebuffDeploymentHours correctly computes the dual-timezone window using Intl.DateTimeFormat.formatToParts, test clocks are mocked to deterministic times (8am ET for off-hours tests, explicit _tick for in-hours tests), and availability is enforced at every boundary — session admission, the Fireworks routing layer, the CLI selector, and the model store initialization. A pre-existing bug where getFireworksPricing fell back to FIREWORKS_MODEL_MAP (the model-name map) instead of FIREWORKS_PRICING_MAP is also fixed.
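
The injected-clock pattern mentioned above can be sketched roughly as follows. The `_tick` name mirrors the one in the summary, but `createTestClock`, `requestSessionSketch`, and the simplified UTC-based `isOpen` predicate are hypothetical stand-ins rather than code from this PR.

```typescript
// A clock that tests can move by hand instead of reading the system time.
interface ClockDeps {
  now: () => Date
  _tick: (next: Date) => void
}

function createTestClock(start: Date): ClockDeps {
  let current = start
  return {
    now: () => current,
    _tick: (next: Date) => {
      current = next
    },
  }
}

// Stand-in availability predicate (assumption): open from 13:00 UTC onward.
// The real check is timezone-aware; this keeps the sketch short.
const isOpen = (date: Date): boolean => date.getUTCHours() >= 13

// Code under test reads time only through `deps.now()`.
function requestSessionSketch(deps: ClockDeps): 'queued' | 'model_unavailable' {
  return isOpen(deps.now()) ? 'queued' : 'model_unavailable'
}

const deps = createTestClock(new Date('2026-04-17T12:00:00Z'))
const before = requestSessionSketch(deps) // clock starts outside the window
deps._tick(new Date('2026-04-17T16:00:00Z'))
const after = requestSessionSketch(deps) // clock moved inside the window
```

Because the code under test never calls `new Date()` itself, the same assertions hold no matter when CI runs.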

Confidence Score: 5/5

Safe to merge — all changes are well-guarded, test clocks are deterministic, and no P0/P1 issues found.

The PR implements a clean model swap with correct dual-timezone window logic, consistent availability enforcement at every layer (session admission, Fireworks routing, CLI store, and selector), and a bonus fix to a pre-existing getFireworksPricing fallback bug. All remaining observations are P2 (UX/design tradeoffs, not defects). Tests are deterministic thanks to injected clocks.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| common/src/constants/freebuff-models.ts | Core of the change: adds availability field, isFreebuffDeploymentHours (dual-timezone window check), isFreebuffModelAvailable, and resolveAvailableFreebuffModel. Logic is correct; h23 hourCycle and formatToParts avoid AM/PM ambiguity. |
| web/src/llm-api/fireworks.ts | Removes serverless fallback; deployment-mapped models now return explicit 503s with structured error codes (DEPLOYMENT_OUTSIDE_HOURS, DEPLOYMENT_COOLDOWN) instead of silently falling back. Also fixes getFireworksPricing fallback to reference FIREWORKS_PRICING_MAP (was FIREWORKS_MODEL_MAP). |
| web/src/server/free-session/public-api.ts | Adds model_unavailable early-return before queue insertion, reuses now across the function, and returns availableHours label in the response. |
| web/src/server/free-session/admission.ts | Marks registered freebuff models as unhealthy during off-hours (blocking admission) without affecting unknown/unregistered models. Logic is sound. |
| cli/src/components/freebuff-model-selector.tsx | Adds GLM-first display ordering, isAvailable gating on interactions, useEffect to fall back to MiniMax when GLM becomes unavailable mid-session, and deployment hours label in the UI. |
| cli/src/hooks/use-freebuff-session.ts | Handles new model_unavailable 409 response: switches the model store to the default (MiniMax) and retries with GET, closing the loop on server-side availability enforcement. |
| web/src/llm-api/tests/fireworks-deployment.test.ts | Updated to use explicit now injection instead of Date.prototype.toLocaleString mocking. Tests now verify the no-fallback behavior: deployment errors return 503 directly. |
| web/src/app/api/v1/freebuff/session/tests/session.test.ts | Clock fixed to 2026-04-17T12:00:00Z (8am ET, outside deployment hours), making model_unavailable assertions deterministic regardless of when CI runs. |
| web/src/app/api/v1/chat/completions/tests/completions.test.ts | Correctly uses conditional assertions (isFreebuffDeploymentHours()) for the GLM test that runs against real clock; other freebuff tests switched to MiniMax (always-available). |
| web/src/server/free-session/tests/public-api.test.ts | Default test clock starts outside deployment hours; tests that need GLM available explicitly call deps._tick(new Date('2026-04-17T16:00:00Z')). New model_unavailable test is deterministic. |
| web/src/llm-api/fireworks-config.ts | Removes Kimi K2.6 deployment entry; adds GLM 5.1 deployment mjb4i7ea. Kimi entries moved to comments. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[User selects model in CLI] --> B{isFreebuffModelAvailable?}
    B -- "GLM 5.1 + inside 9am ET-5pm PT weekday" --> C[POST /freebuff/session with z-ai/glm-5.1]
    B -- "GLM 5.1 + outside hours" --> D[Show Closed · Switch to MiniMax]
    B -- "MiniMax M2.7 always" --> E[POST /freebuff/session with minimax/minimax-m2.7]
    C --> F{Server: isFreebuffModelAvailable?}
    E --> G[Queue / admit immediately]
    F -- "available" --> G
    F -- "unavailable 409 model_unavailable" --> H[Client: setSelectedModel DEFAULT → retry GET]
    G --> I{Admission tick}
    I -- "MiniMax: health check only" --> J[Admit from MiniMax queue]
    I -- "GLM: outside hours → unhealthy" --> K[GLM queue paused]
    I -- "GLM: inside hours + healthy" --> L[Admit from GLM queue]
    L --> M[Chat completions: createFireworksRequestWithFallback]
    J --> N[Chat completions: serverless MiniMax]
    M --> O{isDeploymentHours?}
    O -- "yes + not cooling down" --> P[Try deployment mjb4i7ea]
    O -- "no → 503 DEPLOYMENT_OUTSIDE_HOURS" --> Q[Surface error to client]
    P -- "success" --> R[Return response]
    P -- "5xx" --> S[Return 503 — no serverless fallback]

Reviews (1): Last reviewed commit: "feat: replace Kimi K2.6 with GLM 5.1 as ..."

jahooma merged commit 64edebb into main on Apr 24, 2026
34 checks passed
jahooma deleted the us-hours branch on April 24, 2026 at 22:11