Instant-admit free sessions when below per-model capacity #530
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -176,6 +176,24 @@ export async function queueDepthsByModel(): Promise<Record<string, number>> { | |
| return out | ||
| } | ||
|
|
||
| /** | ||
| * Count of rows currently in `active` status for one model — the threshold | ||
| * check that gates instant admission. Hot-path lookup; callers avoid the | ||
| * full `activeCountsByModel` scan when they only need one model's count. | ||
| */ | ||
| export async function activeCountForModel(model: string): Promise<number> { | ||
| const rows = await db | ||
| .select({ n: count() }) | ||
| .from(schema.freeSession) | ||
| .where( | ||
| and( | ||
| eq(schema.freeSession.status, 'active'), | ||
| eq(schema.freeSession.model, model), | ||
| ), | ||
| ) | ||
| return Number(rows[0]?.n ?? 0) | ||
| } | ||
|
Comment on lines +184 to +195 (Contributor)
Since the comment in
// Optional tighter version that counts only truly live sessions:
export async function activeCountForModel(model: string, now?: Date): Promise<number> {
  const conditions = [
    eq(schema.freeSession.status, 'active'),
    eq(schema.freeSession.model, model),
    ...(now ? [gt(schema.freeSession.expires_at, now)] : []),
  ]
  ...
}
Not a bug given the acknowledged headroom, but worth documenting or addressing in a follow-up.
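To make the difference between the two counts concrete, here is a minimal in-memory sketch. It is not the Drizzle query: `SessionRow`, `liveActiveCount`, and the sample data are hypothetical stand-ins, and the filter only imitates the optional `gt(expires_at, now)` condition from the suggestion.

```typescript
// Hypothetical in-memory stand-in for `free_session` rows; only the fields
// that matter for the count are modelled.
interface SessionRow {
  model: string
  status: 'queued' | 'active'
  expiresAt: Date | null
}

// Counts `active` rows for one model. When `now` is supplied, rows whose
// `expiresAt` is missing or not strictly after `now` are excluded, which is
// what an added `expires_at > now` condition would do in SQL.
function liveActiveCount(rows: SessionRow[], model: string, now?: Date): number {
  return rows.filter((row) => {
    if (row.status !== 'active' || row.model !== model) return false
    if (now === undefined) return true
    return row.expiresAt !== null && row.expiresAt.getTime() > now.getTime()
  }).length
}

const checkTime = new Date('2024-01-01T00:10:00Z')
const sampleRows: SessionRow[] = [
  // Still live at checkTime.
  { model: 'model-a', status: 'active', expiresAt: new Date('2024-01-01T00:30:00Z') },
  // Already expired, but still marked `active` (not yet swept).
  { model: 'model-a', status: 'active', expiresAt: new Date('2024-01-01T00:05:00Z') },
  { model: 'model-a', status: 'queued', expiresAt: null },
  { model: 'model-b', status: 'active', expiresAt: new Date('2024-01-01T00:30:00Z') },
]

const looseCount = liveActiveCount(sampleRows, 'model-a') // status-only count
const tightCount = liveActiveCount(sampleRows, 'model-a', checkTime) // expiry-aware count
```

With this data the status-only count includes the expired row, so it is higher than the expiry-aware count; the gap is the capacity a stale row would occupy until it is cleaned up.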
||
|
|
||
| /** | ||
| * Single-query read of active-row counts bucketed by model. Mirrors | ||
| * `queueDepthsByModel` so the admission tick can log per-model utilization | ||
|
|
@@ -333,6 +351,43 @@ export async function admitFromQueue(params: { | |
| }) | ||
| } | ||
|
|
||
| /** | ||
| * Promote a specific queued user to active. Used by the instant-admit path | ||
| * in `requestSession` when the model's active-session count is below its | ||
| * configured capacity — skips the FIFO advisory-lock dance because each | ||
| * call targets a distinct (user_id, model) and the UPDATE is a no-op if | ||
| * the row isn't queued any more. | ||
| * | ||
| * Returns the updated row or null if the row was not in the expected | ||
| * (queued, same-model) state. | ||
| */ | ||
| export async function promoteQueuedUser(params: { | ||
| userId: string | ||
| model: string | ||
| sessionLengthMs: number | ||
| now: Date | ||
| }): Promise<InternalSessionRow | null> { | ||
| const { userId, model, sessionLengthMs, now } = params | ||
| const expiresAt = new Date(now.getTime() + sessionLengthMs) | ||
| const [row] = await db | ||
| .update(schema.freeSession) | ||
| .set({ | ||
| status: 'active', | ||
| admitted_at: now, | ||
| expires_at: expiresAt, | ||
| updated_at: now, | ||
| }) | ||
| .where( | ||
| and( | ||
| eq(schema.freeSession.user_id, userId), | ||
| eq(schema.freeSession.status, 'queued'), | ||
| eq(schema.freeSession.model, model), | ||
| ), | ||
| ) | ||
| .returning() | ||
| return (row as InternalSessionRow | undefined) ?? null | ||
| } | ||
|
|
||
| /** Stable 31-bit hash so model-keyed advisory lock ids don't overflow int4. */ | ||
| function hashStringToInt32(s: string): number { | ||
| let h = 0 | ||
|
|
||
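The guarded promotion described in the `promoteQueuedUser` doc comment can be illustrated with an in-memory sketch. Everything here is an assumption for illustration: `QueueRow`, `promoteInMemory`, `tryInstantAdmit`, and the `capacity` argument are hypothetical names, and the array stands in for the database table.

```typescript
// Hypothetical in-memory stand-in for a `free_session` row.
interface QueueRow {
  userId: string
  model: string
  status: 'queued' | 'active'
  admittedAt: Date | null
  expiresAt: Date | null
}

interface PromoteParams {
  userId: string
  model: string
  sessionLengthMs: number
  now: Date
}

// Imitates the conditional UPDATE: only a row still queued for the same
// (userId, model) is changed; otherwise nothing is touched and null is returned.
function promoteInMemory(table: QueueRow[], params: PromoteParams): QueueRow | null {
  for (const row of table) {
    if (row.userId === params.userId && row.model === params.model && row.status === 'queued') {
      row.status = 'active'
      row.admittedAt = params.now
      row.expiresAt = new Date(params.now.getTime() + params.sessionLengthMs)
      return row
    }
  }
  return null
}

// Assumed gate (the `capacity` parameter is hypothetical): promote only while
// the model's active count is below its configured capacity.
function tryInstantAdmit(table: QueueRow[], params: PromoteParams, capacity: number): QueueRow | null {
  let active = 0
  for (const row of table) {
    if (row.status === 'active' && row.model === params.model) active += 1
  }
  return active < capacity ? promoteInMemory(table, params) : null
}

const startedAt = new Date('2024-01-01T00:00:00Z')
const sessions: QueueRow[] = [
  { userId: 'u1', model: 'model-a', status: 'queued', admittedAt: null, expiresAt: null },
  { userId: 'u2', model: 'model-a', status: 'queued', admittedAt: null, expiresAt: null },
]
const base = { model: 'model-a', sessionLengthMs: 60000, now: startedAt }

const admitted = tryInstantAdmit(sessions, { ...base, userId: 'u1' }, 1) // promoted
const replayed = promoteInMemory(sessions, { ...base, userId: 'u1' }) // null: u1 is no longer queued
const overCapacity = tryInstantAdmit(sessions, { ...base, userId: 'u2' }, 1) // null: model is at capacity
```

In this sketch the count and the promotion are two separate steps, so it does not model concurrent callers that pass the gate at the same moment.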
`INSTANT_ADMIT_CAPACITY` hardcodes model ID strings (`'z-ai/glm-5.1'`, `'minimax/minimax-m2.7'`). If a model is renamed or a new default is added to the `@codebuff/common/constants/freebuff-models` registry, this map will silently return `0` for the new ID and every user will fall back to the FIFO queue, without any error or warning.

Consider importing the canonical model ID constants from `@codebuff/common` so a rename triggers a compile-time error rather than a silent behaviour change:
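One possible shape for that change, as a sketch only: `FREEBUFF_MODEL_IDS`, `FreebuffModelId`, `instantAdmitCapacity`, and the capacity numbers are placeholders declared locally, because the actual exports of `@codebuff/common` are not shown here.

```typescript
// Hypothetical stand-ins for the canonical IDs. In the real code these would
// be imported from the shared registry instead of being declared locally.
const FREEBUFF_MODEL_IDS = {
  glm: 'z-ai/glm-5.1',
  minimax: 'minimax/minimax-m2.7',
} as const

type FreebuffModelId = (typeof FREEBUFF_MODEL_IDS)[keyof typeof FREEBUFF_MODEL_IDS]

// Record<FreebuffModelId, number> requires an entry for every registered ID,
// so adding or renaming an ID fails to compile until this map is updated.
// The numbers are placeholder values, not the PR's.
const INSTANT_ADMIT_CAPACITY: Record<FreebuffModelId, number> = {
  [FREEBUFF_MODEL_IDS.glm]: 2,
  [FREEBUFF_MODEL_IDS.minimax]: 3,
}

// Runtime guard for IDs that arrive as plain strings (for example from a request).
function isKnownModel(model: string): model is FreebuffModelId {
  return Object.prototype.hasOwnProperty.call(INSTANT_ADMIT_CAPACITY, model)
}

// Unknown IDs still resolve to 0 at runtime, but the fallback is now explicit
// and known IDs are checked by the compiler.
function instantAdmitCapacity(model: string): number {
  return isKnownModel(model) ? INSTANT_ADMIT_CAPACITY[model] : 0
}
```

The compile-time check covers only IDs known to the type; strings coming from outside the program still need the runtime guard, where a log line or metric on the `0` branch would make the fallback visible.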