Rate-limit freebuff GLM sessions to 5 per 20 hours #537
Conversation
Adds a `free_session_admit` audit log (one row per queued→active transition) and gates `POST /api/v1/freebuff/session` against it so GLM 5.1 users who've already had 5 one-hour sessions in the last 20h are blocked with a new `rate_limited` status (HTTP 429). Queued/active responses now carry an optional `rateLimit` quota the CLI renders as "N / 5 used in last 20h" so users see their remaining allowance as soon as they join the waitlist. Minimax is left unlimited.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
**Greptile Summary**

This PR introduces a per-user, per-model session rate limit for freebuff GLM 5.1 users (5 sessions per 20-hour rolling window), backed by a new `free_session_admit` audit table. One P1 logic issue was found in the rate-limit check in `web/src/server/free-session/public-api.ts` (rate-limit check ordering, lines 235–255).

**Confidence Score: 3/5.** Needs one targeted fix before merge: reconnecting to an already-admitted session incorrectly triggers the rate-limit block. The overall design is solid — DB schema, transaction safety, CLI rendering, and test coverage are all well done. The P1 bug where the rate-limit check runs before reading the existing session row means a user promoted from the queue while their CLI is offline loses their legitimately-earned 5th session on reconnect. This directly affects the primary user path (rejoining after disconnection at the limit boundary) and is worth fixing before shipping.
**Sequence Diagram**

```mermaid
sequenceDiagram
    participant CLI
    participant API as POST /session
    participant PublicAPI as public-api.ts
    participant Store as store.ts
    participant DB
    CLI->>API: POST (model=z-ai/glm-5.1)
    API->>PublicAPI: requestSession()
    PublicAPI->>DB: listRecentAdmits(userId, model, since, limit=5)
    DB-->>PublicAPI: [Date, Date, Date, Date, Date] (5 rows)
    alt recentCount >= 5 (rate limited)
        PublicAPI-->>API: { status: rate_limited, retryAfterMs }
        API-->>CLI: 429 { status: rate_limited, ... }
        CLI->>CLI: Show rate-limited screen, stop polling
    else recentCount < 5 (allowed)
        PublicAPI->>DB: joinOrTakeOver (UPSERT queued)
        alt instant-admit capacity available
            PublicAPI->>DB: activeCountForModel()
            DB-->>PublicAPI: count < capacity
            PublicAPI->>Store: promoteQueuedUser()
            Store->>DB: UPDATE free_session SET status=active
            Store->>DB: INSERT free_session_admit (audit row)
            Store-->>PublicAPI: active row
        end
        PublicAPI->>DB: listRecentAdmits() again (attachRateLimit)
        DB-->>PublicAPI: updated count
        PublicAPI-->>API: { status: queued/active, rateLimit }
        API-->>CLI: 200 with quota info
        CLI->>CLI: Show N/5 used in waiting room
    end
    Note over Store,DB: admitFromQueue tick also writes free_session_admit
```
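The two-step gate in the diagram (filter admits to the rolling window, then compare against the cap) can be sketched as pure functions. This is an illustrative sketch only; `admitsInWindow` and `gateSession` are hypothetical names, not the PR's actual `public-api.ts` code:

```typescript
// Illustrative sketch of the window filter + cap check from the diagram.
// Constants (20h window, limit of 5) come from the PR description.
function admitsInWindow(allAdmits: Date[], now: Date, windowHours = 20): Date[] {
  const since = now.getTime() - windowHours * 60 * 60 * 1000
  // Only admits inside the rolling window count against the quota.
  return allAdmits.filter((d) => d.getTime() >= since)
}

function gateSession(
  allAdmits: Date[],
  now: Date,
  limit = 5,
): 'rate_limited' | 'proceed' {
  const recentCount = admitsInWindow(allAdmits, now).length
  return recentCount >= limit ? 'rate_limited' : 'proceed'
}
```

Admits older than 20h simply stop counting, which is what makes this a rolling window rather than a daily reset.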
Reviews (1): Last reviewed commit: "Rate-limit freebuff GLM sessions to 5 pe..."
```typescript
// Rate-limit check runs before joinOrTakeOver so heavy users never even
// create a queued row. Only models listed in RATE_LIMITS are gated; others
// (Minimax today) fall through unchanged.
const snapshot = await fetchRateLimitSnapshot(params.userId, model, deps)
if (snapshot && snapshot.info.recentCount >= snapshot.info.limit) {
  // Oldest admit's window-anniversary is when one slot opens back up.
  // Clamped at 0 so a clock skew can't surface a negative retry-after.
  const windowMs = snapshot.info.windowHours * 60 * 60 * 1000
  const retryAfterMs = Math.max(
    0,
    (snapshot.oldest?.getTime() ?? 0) + windowMs - nowOf(deps).getTime(),
  )
  return {
    status: 'rate_limited',
    model,
    limit: snapshot.info.limit,
    windowHours: snapshot.info.windowHours,
    recentCount: snapshot.info.recentCount,
    retryAfterMs,
  }
}
```
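The retry-after clamp in that hunk can be exercised standalone. A minimal sketch of the same arithmetic (the `retryAfterMs` helper is hypothetical; the 20h constant comes from the PR description):

```typescript
// Standalone version of the retry-after arithmetic from the diff above.
const WINDOW_HOURS = 20

function retryAfterMs(oldestAdmit: Date | undefined, now: Date): number {
  const windowMs = WINDOW_HOURS * 60 * 60 * 1000
  // When the oldest admit ages out of the window, one slot frees up.
  // Math.max(0, ...) keeps clock skew from producing a negative retry-after.
  return Math.max(0, (oldestAdmit?.getTime() ?? 0) + windowMs - now.getTime())
}
```

With the oldest admit 19 hours old, the caller is told to retry in one hour; an admit already older than the window yields 0.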
**Rate limit check fires before join, blocking reconnect to legitimately-admitted session**
The rate-limit check in `requestSession` runs unconditionally before `joinOrTakeOver`. This means a user who is promoted from the queue while their CLI is offline (5th admission written by the tick) and then reconnects will hit the check in this order:

1. CLI starts → GET returns `active` (they were promoted while away)
2. Startup-takeover branch fires a POST to rotate instance id
3. POST enters `requestSession` → `fetchRateLimitSnapshot` → `recentCount = 5 >= 5`
4. Returns `rate_limited` without ever calling `joinOrTakeOver`
5. User sees "Session limit reached" and their legitimately-earned 5th session expires unused

The correct behaviour is that the rate-limit gate should only block new queue entries, not reconnections to an already-active or already-queued slot. A minimal fix is to read the existing row first and only apply the rate-limit check when no active/queued row exists.

The test suite does not cover this reconnect-while-promoted scenario, which is why it wasn't caught.
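The suggested ordering can be sketched as a small decision function: read the caller's existing row first, and only consult the quota when no live row is held. `shouldRateLimit` and its row shape are hypothetical names for illustration, not the PR's actual types:

```typescript
// Hypothetical sketch of the suggested fix: the quota only gates fresh
// admissions, never re-anchoring to a row the user already holds.
type ExistingRow = { status: 'queued' | 'active'; expired: boolean } | undefined

function shouldRateLimit(
  existing: ExistingRow,
  recentCount: number,
  limit = 5,
): boolean {
  // A live queued/active row means this is a reconnect, not a new admission.
  if (existing && !existing.expired) return false
  // Otherwise (no row, or an expired one) the rolling-window quota applies.
  return recentCount >= limit
}
```

Note the expired-row branch: per the follow-up commit, an expired row still counts as a fresh admission attempt and stays blocked at the cap.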
```diff
  const status =
-   state.status === 'model_locked' ? 409 : state.status === 'banned' ? 403 : 200
+   state.status === 'model_locked'
+     ? 409
+     : state.status === 'banned'
+       ? 403
+       : state.status === 'rate_limited'
+         ? 429
+         : 200
  return NextResponse.json(state, { status })
```
**Missing `Retry-After` HTTP header on 429 response**

RFC 6585 and standard API practice expect a `Retry-After` header on 429 responses so HTTP middleware, proxies, and any future clients can honour the back-off without parsing the JSON body. The `state` object already carries `retryAfterMs`, so the header can be set cheaply:
```typescript
// Round up so a sub-second remainder still tells clients to wait 1s.
const retryAfterSec = Math.ceil(
  ((state as { retryAfterMs?: number }).retryAfterMs ?? 0) / 1000,
)
return NextResponse.json(state, {
  status: 429,
  headers: { 'Retry-After': String(retryAfterSec) },
})
```

```sql
);
--> statement-breakpoint
ALTER TABLE "free_session_admit" ADD CONSTRAINT "free_session_admit_user_id_user_id_fk" FOREIGN KEY ("user_id") REFERENCES "public"."user"("id") ON DELETE cascade ON UPDATE no action;--> statement-breakpoint
CREATE INDEX "idx_free_session_admit_user_model_time" ON "free_session_admit" USING btree ("user_id","model","admitted_at");
```
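The composite index lines up with the read path one would expect `listRecentAdmits` to take — equality on the leading columns, then a range scan on time. The query below is a hypothetical sketch of that shape (column names from the migration; the actual query in `store.ts` is not shown in this diff):

```sql
-- Hypothetical lookup shape served by idx_free_session_admit_user_model_time:
-- equality on (user_id, model), range scan on admitted_at.
SELECT "admitted_at"
FROM "free_session_admit"
WHERE "user_id" = $1
  AND "model" = $2
  AND "admitted_at" >= $3   -- e.g. now() - interval '20 hours'
ORDER BY "admitted_at" ASC
LIMIT 5;
```

Putting `admitted_at` last lets the btree satisfy both the window filter and the oldest-first ordering without a separate sort.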
…limit

```
# Conflicts:
#   cli/src/hooks/use-freebuff-session.ts
#   web/src/app/api/v1/freebuff/session/_handlers.ts
#   web/src/server/free-session/public-api.ts
```
`requestSession` is the takeover path as well as the join path, so a user whose 5th GLM admit put them at the cap would get `rate_limited` on CLI restart and lose access to their still-active session (or their queue position). Skip the quota check when the caller already holds a queued or active+unexpired row for the same model — admit counts only need to gate fresh admissions, not re-anchoring to an existing row. Expired rows still count as fresh and remain blocked.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
**Summary**

- Adds a `free_session_admit` audit log (one row per queued→active transition) via a new migration, and records admissions from both `admitFromQueue` and `promoteQueuedUser`.
- Gates `POST /api/v1/freebuff/session` against that log so GLM 5.1 users who've already had 5 one-hour sessions in the last 20h get a new `rate_limited` response (HTTP 429). Minimax stays unlimited.
- Queued/active responses carry an optional `rateLimit` quota snapshot that the CLI renders as "N / 5 used in last 20h" the moment the user joins the waitlist; the `rate_limited` terminal screen shows "X of 5 sessions used on <model> in the last 20h. Try again in <retry-after>."

**Test plan**

- `tsc --noEmit` clean across `web`, `cli`, `common`, `packages/internal`.

🤖 Generated with Claude Code