feat(exa): add $10/month free allowance with credit billing for overages #2169

Merged
RSO merged 23 commits into main from RSO/precious-amazonsaurus
Apr 8, 2026

@RSO RSO commented Apr 8, 2026

Summary

Adds billing for the Exa proxy endpoints. Each user gets $10 in free credits per month, and usage is tracked so we can recompute balances. The $10 is configurable, but only on a monthly basis.

Storage strategy

Two tables to avoid the unbounded row growth that microdollar_usage suffers from:

  • exa_monthly_usage — pre-aggregated counter (one row per user per month). The pre-request balance check reads this single row (O(1) forever).
  • exa_usage_log — per-request audit trail, range-partitioned by month on created_at. Never queried in the hot path.
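To make the hot-path read concrete, here is a minimal sketch of a pre-request check that looks only at the single counter row. The function name, the row shape, and the three-way decision are assumptions for illustration; only the table name and the microdollar unit come from this PR.

```typescript
// Hypothetical shape of one exa_monthly_usage row (field names are illustrative).
interface MonthlyUsageRow {
  totalCostMicrodollars: number;
  freeAllowanceMicrodollars: number;
}

type Decision = "free" | "paid" | "rejected";

// Decide how the next request is treated from the counter row and the
// caller's credit balance alone; no per-request log rows are scanned.
function checkExaRequest(
  row: MonthlyUsageRow | undefined,
  balanceMicrodollars: number,
  defaultAllowanceMicrodollars: number = 10_000_000, // $10, assumed default
): Decision {
  const used = row ? row.totalCostMicrodollars : 0;
  const allowance = row ? row.freeAllowanceMicrodollars : defaultAllowanceMicrodollars;
  if (used < allowance) return "free";
  return balanceMicrodollars > 0 ? "paid" : "rejected";
}
```

Because the input is one row per user per month, the cost of this check does not grow with the number of requests a user has made.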

Verification

  • pnpm typecheck — passes
  • pnpm test -- apps/web/src/app/api/exa — 23 tests pass (auth, paths, streaming stripped, allowance, balance check, cost recording)
  • pnpm test -- apps/web/src/lib/exa-usage.test.ts — 8 tests pass (counter reads, upserts, balance deduction)
  • pnpm format + pnpm lint — clean
  • Test plan:
    • Created a new user
    • Made a call to the Exa API: works
    • Set the Exa total_cost_microdollars to 10000000
    • Call the API again: Fail (no balance, out of free credits)
    • Add money to balance
    • Call the API again: Success ✅
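For readers checking the numbers in the test plan: 10000000 microdollars is exactly the $10 allowance. A small sketch of the conversion, with helper names that are hypothetical and not taken from the codebase:

```typescript
// One dollar is one million microdollars.
const MICRODOLLARS_PER_DOLLAR = 1_000_000;

// Convert a dollar amount (for example a cost reported in dollars) into
// integer microdollars, rounding away floating-point noise.
function dollarsToMicrodollars(dollars: number): number {
  return Math.round(dollars * MICRODOLLARS_PER_DOLLAR);
}

// Convert a stored microdollar amount back into dollars for display.
function microdollarsToDollars(microdollars: number): number {
  return microdollars / MICRODOLLARS_PER_DOLLAR;
}
```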

Also made sure that the charged_to_balance prop is correctly set:

[Screenshot: CleanShot 2026-04-08 at 15 20 05@2x]

Reviewer notes

I feel confident that I've tested most of the edge cases for this feature, but I'd love a second pair of eyes on the logic, as it's likely to be costly if we make a mistake with the accounting.

Implement per-user monthly Exa usage tracking with a free tier + overage model:
- First $10/month is free for all authenticated users
- Beyond $10, usage is charged to the user's (or org's) Kilo credit balance
- Streaming disabled to guarantee cost tracking via costDollars in JSON responses
- Two-table storage strategy: pre-aggregated counter (O(1) lookups) + partitioned audit log
@RSO RSO self-assigned this Apr 8, 2026
Comment thread apps/web/src/lib/exa-usage.ts

kilo-code-bot Bot commented Apr 8, 2026

Code Review Summary

Status: 4 Issues Found | Recommendation: Address before merge

Overview

| Severity | Count |
| --- | --- |
| CRITICAL | 0 |
| WARNING | 4 |
| SUGGESTION | 0 |

WARNING

| File | Line | Issue |
| --- | --- | --- |
| apps/web/src/lib/exa-usage.ts | 166 | Paid Exa usage still updates cached balance columns without writing a microdollar_usage row, so recompute relies on Exa-specific side channels instead of the normal ledger. |
| apps/web/src/lib/recomputeUserBalances.ts | 149 | Recompute adds aggregated Exa charged totals only after scanning credit history, so any Exa spend that happened before later credits/expirations is applied out of order and produces incorrect baselines. |
| apps/web/src/lib/recomputeOrganizationBalances.ts | 113 | The org recompute path has the same ordering bug: monthly Exa aggregates are appended after the baseline pass, so grants and expirations are recalculated as if all Exa spend happened last. |
| packages/db/src/migrations/0078_workable_boom_boom.sql | 24 | The migration only seeds April/May 2026 partitions, so deployments after those months will fail exa_usage_log inserts until a newer partition is created. |
Other Observations (not in diff)
  • The PR description still embeds the verification screenshot with an HTML <img> tag. Per the markdown guidance, this should use markdown image syntax instead, e.g. ![Image Name](...).
Files Reviewed (40 files)
  • .agents/skills/durable-objects/SKILL.md - 0 issues
  • .agents/skills/durable-objects/references/rules.md - 0 issues
  • .agents/skills/durable-objects/references/testing.md - 0 issues
  • .agents/skills/durable-objects/references/workers.md - 0 issues
  • .agents/skills/workers-best-practices/SKILL.md - 0 issues
  • .agents/skills/workers-best-practices/references/review.md - 0 issues
  • .agents/skills/workers-best-practices/references/rules.md - 0 issues
  • .agents/skills/wrangler/SKILL.md - 0 issues
  • .plans/exa-per-user-free-allowance.md - 0 issues
  • apps/web/next.config.mjs - 0 issues
  • apps/web/package.json - 0 issues
  • apps/web/src/app/admin/components/UserAdmin/UserAdminGdprRemoval.tsx - 0 issues
  • apps/web/src/app/api/internal/kiloclaw/instance-ready/route.ts - 0 issues
  • apps/web/src/app/api/private/users/route.ts - 0 issues
  • apps/web/src/components/cloud-agent-next/MessageBubble.tsx - 0 issues
  • apps/web/src/lib/config.server.ts - 0 issues
  • apps/web/src/lib/kiloclaw/access-gate.ts - 0 issues
  • apps/web/src/lib/kiloclaw/access-state.ts - 0 issues
  • apps/web/src/lib/kiloclaw/instance-lifecycle.test.ts - 0 issues
  • apps/web/src/lib/kiloclaw/instance-lifecycle.ts - 0 issues
  • apps/web/src/lib/kiloclaw/kiloclaw-internal-client.ts - 0 issues
  • apps/web/src/lib/providers/index.ts - 0 issues
  • apps/web/src/routers/admin-kiloclaw-user-router.test.ts - 0 issues
  • apps/web/src/routers/admin-router.ts - 0 issues
  • apps/web/src/routers/kiloclaw-billing-router.test.ts - 0 issues
  • apps/web/src/routers/kiloclaw-router.ts - 0 issues
  • apps/web/src/routers/kiloclaw-send-chat-message.test.ts - 0 issues
  • apps/web/vercel.json - 0 issues
  • packages/db/src/migrations/0077_puzzling_gabe_jones.sql - 0 issues
  • packages/db/src/migrations/0078_workable_boom_boom.sql - 1 issue
  • packages/db/src/schema-types.ts - 0 issues
  • packages/db/src/schema.ts - 0 issues
  • services/kiloclaw-billing/src/index.test.ts - 0 issues
  • services/kiloclaw-billing/src/lifecycle.test.ts - 0 issues
  • services/kiloclaw-billing/src/lifecycle.ts - 0 issues
  • services/kiloclaw/src/routes/controller.test.ts - 0 issues
  • services/kiloclaw/src/routes/controller.ts - 0 issues
  • services/kiloclaw/src/routes/platform-start-async.test.ts - 0 issues
  • services/kiloclaw/src/routes/platform.ts - 0 issues
  • skills-lock.json - 0 issues

Reviewed by gpt-5.4-20260305 · 950,299 tokens

RSO added 2 commits April 8, 2026 11:33
Add free_allowance_microdollars column to exa_monthly_usage with lock-in
semantics: the allowance is set on the first request of the month (INSERT)
and not overwritten on subsequent requests (excluded from ON CONFLICT UPDATE).

This enables per-user tiered allowances in the future by modifying a single
pure function (getExaFreeAllowanceMicrodollars) without changing the billing
flow. The route handler now uses the stored allowance instead of the global
constant, and the 402 error message reflects the actual allowance.
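
The lock-in semantics can be illustrated with an in-memory stand-in for the upsert. This is a sketch only: the object-backed table, the key format, and the function name are assumptions, and the real code uses a database INSERT with ON CONFLICT UPDATE.

```typescript
interface MonthlyRow {
  totalCostMicrodollars: number;
  freeAllowanceMicrodollars: number;
}

// In-memory stand-in for INSERT ... ON CONFLICT DO UPDATE on exa_monthly_usage.
// The allowance is written only when the row is first created; later requests
// add to the cost counter and leave the stored allowance untouched.
function upsertMonthlyUsage(
  table: Record<string, MonthlyRow>,
  userId: string,
  month: string,
  costMicrodollars: number,
  currentAllowanceMicrodollars: number,
): MonthlyRow {
  const key = userId + ":" + month;
  const existing = table[key];
  const row: MonthlyRow = existing
    ? {
        totalCostMicrodollars: existing.totalCostMicrodollars + costMicrodollars,
        freeAllowanceMicrodollars: existing.freeAllowanceMicrodollars, // locked in
      }
    : {
        totalCostMicrodollars: costMicrodollars,
        freeAllowanceMicrodollars: currentAllowanceMicrodollars,
      };
  table[key] = row;
  return row;
}
```

If the configured allowance changes mid-month, a user who already made a request keeps the value stored on their row until the next month's row is created.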
Comment thread apps/web/src/app/api/exa/[...path]/route.test.ts
RSO added 3 commits April 8, 2026 12:11
Add organization_id to exa_monthly_usage so charged amounts are tracked
per-context (personal vs org). Both recompute functions now include
total_charged_microdollars from exa_monthly_usage in cumulative usage,
preventing paid Exa charges from vanishing on balance recomputation.

Squashes branch migrations 0077+0078 into a single 0077.

The route now passes readDb as a 3rd argument to getBalanceAndOrgSettings,
but two toHaveBeenCalledWith assertions only expected 2 arguments. The
failing assertion triggers Jest's pretty-printer on the real Drizzle
client (a Proxy-backed object), which hangs indefinitely.
Comment thread apps/web/src/lib/recomputeUserBalances.ts Outdated
Comment thread apps/web/src/lib/recomputeOrganizationBalances.ts Outdated
RSO added 4 commits April 8, 2026 14:26
Replace the exa_monthly_usage lump-sum approach with a chronological
merge-sort of exa_usage_log records into the usage stream. This gives
correct credit-expiration baselines when Exa charges are interleaved
with credit grants/expirations.

- Stop dropping old exa_usage_log partitions (retain indefinitely)
- Register exa-partition-maintenance cron in vercel.json
- Promote exa_usage_log insert from fire-and-forget to required
- Recompute functions merge-sort exa_usage_log with microdollar_usage
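
A minimal sketch of the chronological merge this commit describes, assuming both inputs are already sorted by creation time. The `UsageEvent` shape and the function name are hypothetical; the actual recompute functions work on database rows.

```typescript
interface UsageEvent {
  createdAt: number; // epoch milliseconds
  costMicrodollars: number;
  source: "microdollar_usage" | "exa_usage_log";
}

// Merge two streams that are each sorted by createdAt into one sorted stream,
// so every charge is applied at the point in time where it actually happened.
function mergeByCreatedAt(a: UsageEvent[], b: UsageEvent[]): UsageEvent[] {
  const merged: UsageEvent[] = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    // On ties, take from the first stream so the merge is stable.
    if (a[i].createdAt <= b[j].createdAt) {
      merged.push(a[i++]);
    } else {
      merged.push(b[j++]);
    }
  }
  while (i < a.length) merged.push(a[i++]);
  while (j < b.length) merged.push(b[j++]);
  return merged;
}
```

Compared with appending a monthly lump sum after the baseline pass, this keeps Exa spend in the right position relative to credit grants and expirations.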
Comment thread packages/db/src/migrations/0078_light_jocasta.sql
@RSO RSO requested a review from markijbema April 8, 2026 13:21
Comment thread apps/web/next.config.mjs Outdated
@kilo-code-bot

This comment was marked as resolved.

@kilo-code-bot

This comment was marked as resolved.

Comment thread .plans/exa-monthly-allowance.md Outdated
@kilo-code-bot

This comment was marked as resolved.

Comment thread apps/web/src/app/api/cron/exa-partition-maintenance/route.ts Outdated
@kilo-code-bot

This comment was marked as resolved.

Comment thread apps/web/src/app/api/cron/exa-partition-maintenance/route.ts Outdated
Comment thread apps/web/src/app/api/exa/[...path]/route.ts
Comment thread apps/web/src/lib/constants.ts Outdated
@kilo-code-bot

This comment was marked as resolved.

Comment thread apps/web/src/lib/recomputeUserBalances.ts
@markijbema

This comment was marked as resolved.


kilo-code-bot Bot commented Apr 8, 2026

markijbema's Assessment (expanded)

Bookkeeping & Recoverability Review

3) exa_usage_log insert fails (partition missing): The counter (exa_monthly_usage) is upserted before the log insert. If the log insert fails, the counter is incremented but deductFromBalance never runs. The user effectively gets a free paid request, and recompute can't recover because the log row is missing. Recovery would require manually comparing sum(exa_usage_log.cost WHERE charged) vs exa_monthly_usage.total_charged_microdollars per user-month.

markijbema: Sounds farfetched, but also the fix of reordering the statements is easy I think.
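
The reordering suggested here can be sketched as follows. The three steps are injected as plain functions and run synchronously for brevity; `recordExaUsageSketch` and the `ExaUsageSteps` interface are hypothetical stand-ins, not the real `recordExaUsage` signature.

```typescript
interface ExaUsageSteps {
  insertLog: () => void;         // may throw, e.g. when the month's partition is missing
  upsertCounter: () => void;     // increments the exa_monthly_usage counter
  deductFromBalance: () => void; // charges the paid portion to the balance
}

// Write the audit-log row first. If that insert throws, nothing else has
// happened yet, so there is no incremented counter without a matching log row.
// If a later step fails, the log row exists and recompute can recover from it.
function recordExaUsageSketch(steps: ExaUsageSteps): void {
  steps.insertLog();
  steps.upsertCounter();
  steps.deductFromBalance();
}
```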

8) Double-counting risk: deductFromBalance (personal) doesn't insert into microdollar_usage: For personal billing, deductFromBalance directly increments kilocode_users.microdollars_used without inserting into microdollar_usage. Recompute picks up the charge from exa_usage_log instead. If someone later changes deductFromBalance to also insert into microdollar_usage, recompute would double-count — once from microdollar_usage and once from exa_usage_log.

markijbema: The advice of adding a comment is sound, especially since it will probably be an agent reading it.

9) Org billing via ingestOrganizationTokenUsage: double-counting risk: For org billing, deductFromBalance calls ingestOrganizationTokenUsage, which may insert into microdollar_usage. If so, recomputeOrganizationBalances would pick up both the microdollar_usage row AND the exa_usage_log row — double-counting every paid org Exa request. This was flagged as the most critical finding to verify before merging.

markijbema: Sounds like bullshit, can you confirm?

Confirmed bullshit. ingestOrganizationTokenUsage does NOT insert into microdollar_usage. It only updates the denormalized organizations.microdollars_used counter and upserts organization_user_usage (daily per-user limit tracking). The bot speculated this was an issue without verifying the actual code. recomputeOrganizationBalances queries microdollar_usage (LLM) and exa_usage_log (Exa) — two distinct ledgers with no overlap. No double-counting.


Comparison: Exa bookkeeping vs. microdollar_usage bookkeeping

1) Org billing: double-counting risk in recompute: ingestOrganizationTokenUsage updates organization_user_usage (per-user daily tracking). The exa path constructs a synthetic MicrodollarUsage record with provider: 'exa' and model: path to satisfy the type. If ingestOrganizationTokenUsage also inserts into microdollar_usage, then recomputeOrganizationBalances (which now queries both microdollar_usage and exa_usage_log) would double-count every paid org Exa request.

markijbema: Sounds like 9 from the previous comment, now I'm starting to doubt whether there is truth to this :P I really hate the personal/org if-code :(

Same false alarm as #9 above. ingestOrganizationTokenUsage does not insert into microdollar_usage, so there is no double-counting. The bot raised this as a hypothetical and never verified it. Note: organization_user_usage daily tracking will include Exa charges (since ingestOrganizationTokenUsage upserts there), meaning daily user limits count Exa spend — probably desirable but undocumented.

5) The after() callback doesn't catch JSON parse failures: In route.ts:99-110, the after() callback does await cloned.json(). If the upstream response isn't valid JSON (network error, truncated response), this throws and the entire after() callback silently fails. The old code had a try/catch around this; the new code removed it. Even non-billing logging failures silently disappear.

markijbema: Is this correct? I think we do want to send something to Sentry?
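
One way to keep parse failures visible is to wrap the parsing in a try/catch and hand the error to a reporter. In this sketch the reporter is an injected callback standing in for an error tracker such as Sentry, and reading the cost from a `total` field under `costDollars` is an assumption about the response shape.

```typescript
type Reporter = (error: unknown) => void;

// Parse an upstream response body and pull out the cost in dollars.
// Invalid JSON or a missing cost is reported instead of failing silently.
function parseCostDollars(body: string, report: Reporter): number | null {
  try {
    const parsed = JSON.parse(body);
    // Assumed shape: { costDollars: { total: number } }
    const total = parsed && parsed.costDollars ? parsed.costDollars.total : undefined;
    if (typeof total !== "number") {
      report(new Error("response has no numeric costDollars.total"));
      return null;
    }
    return total;
  } catch (error) {
    report(error);
    return null;
  }
}
```

Returning `null` rather than rethrowing lets the caller decide whether a request with unknown cost should be skipped or flagged, while the reporter ensures the failure is still recorded.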

6) Free allowance is per-user across all contexts, but charges are per-context: The free allowance check aggregates usage across personal AND org rows. So if a user uses $6 personally and $5 through an org, they've used $11 total and the next request is "paid." But the charge goes to whichever context the request is made in. A user could consume their free tier entirely through org requests, then a personal request would be "paid" and charged to their personal balance.

markijbema: Please add a comment somewhere, I assume we'll throw the plans out later?
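
The cross-context behaviour described in this item can be written down as a small worked example. The row shape and function name are hypothetical; the point is only that usage is summed over every context before being compared with the allowance.

```typescript
interface ContextUsage {
  organizationId: string | null; // null means the personal context
  totalCostMicrodollars: number;
}

// The free allowance is per user: sum usage over personal and org rows,
// then compare against the allowance. Which balance gets charged is decided
// separately, by the context the request is made in.
function isNextRequestPaid(rows: ContextUsage[], allowanceMicrodollars: number): boolean {
  let used = 0;
  for (const row of rows) {
    used += row.totalCostMicrodollars;
  }
  return used >= allowanceMicrodollars;
}

// The example from the review: $6 personal plus $5 through an org.
const exampleRows: ContextUsage[] = [
  { organizationId: null, totalCostMicrodollars: 6_000_000 },
  { organizationId: "org_1", totalCostMicrodollars: 5_000_000 },
];
const examplePaid = isNextRequestPaid(exampleRows, 10_000_000);
```

Here `examplePaid` is `true`, because the combined $11 exceeds the $10 allowance even though neither context alone does.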


Architecture Review: Custom Exa Billing vs. Reusing Existing Free-Tier Infrastructure

The bot review noted that the custom approach creates a parallel billing pipeline. The personal path does a bare UPDATE kilocode_users SET microdollars_used += X while the org path goes through ingestOrganizationTokenUsage. This asymmetry means users and orgs are handled very differently for exa billing, unlike the existing LLM billing where both go through insertUsageRecord().

markijbema: This also implies that we handle users and orgs very differently, indeed please do consider.


Plan vs. Implementation Review

6) Plan said insertUsageRecord() for personal billing, implementation does a bare UPDATE: The plan said the code would call insertUsageRecord() to write to microdollar_usage and increment the balance. The implementation deliberately avoids this to prevent double-counting in recompute (which already reads from exa_usage_log). But the deductFromBalance function lacks a comment explaining why it doesn't use insertUsageRecord(). A future developer (or agent) could "fix" this and introduce double-counting.

markijbema: Please add the comment.

7) Org path through ingestOrganizationTokenUsage — potential double-counting: Does ingestOrganizationTokenUsage insert into microdollar_usage? If yes, then recomputeOrganizationBalances will find the charge in both microdollar_usage (from ingestOrganizationTokenUsage) AND exa_usage_log (from step 2 of recordExaUsage), resulting in double-counting every paid org Exa request.

markijbema: Please verify these claims about microdollar usage.

Verified: no double-counting. Same root cause as #9 and Comparison #1: ingestOrganizationTokenUsage does not insert into microdollar_usage.


Security Review: Exa Free Tier Abuse Vectors

The review identified 5 abuse vectors: (1) request abort = free usage, (2) no error handling in after(), (3) read-replica TOCTOU burst bypassing the free tier, (4) single-request threshold crossing, (5) multi-account free tier multiplication.

markijbema: This all is fine imho.


Action Items Summary

| # | Action | Source |
| --- | --- | --- |
| 1 | Verify whether ingestOrganizationTokenUsage inserts into microdollar_usage (double-counting risk) | Bookkeeping #9, Comparison #1, Plan vs Impl #7 |
| 2 | Add comment to deductFromBalance explaining why it does NOT use insertUsageRecord() | Bookkeeping #8, Plan vs Impl #6 |
| 3 | Add comment documenting the per-user cross-context free allowance design | Comparison #6 |
| 4 | Consider reordering statements in recordExaUsage (log insert before counter upsert) | Bookkeeping #3 |
| 5 | Add try/catch with Sentry reporting to the after() callback | Comparison #5 |
| 6 | Consider the user/org billing asymmetry | Architecture Review |

Update: Items #9 (Bookkeeping), #1 (Comparison), and #7 (Plan vs Impl) were all the same false alarm — ingestOrganizationTokenUsage does not insert into microdollar_usage. Verified by reading the actual implementation. Action item #1 is resolved: no double-counting risk.

@markijbema markijbema left a comment

Some inline human comments from me. I think the LLM analysis found some valid points, summarized in the last post, but I don't think any of them are really blocking.

RSO added 7 commits April 8, 2026 19:28
…equest leak

If exa_usage_log insert fails (e.g. missing partition), the counter
was already incremented but deductFromBalance never ran — giving the
user a free paid request with no log row for recompute to recover from.
Reordering so the log insert happens first ensures a failed insert
leaves no side effects, and any later failure is recoverable.
… billing

Recompute already picks up personal Exa charges from exa_usage_log,
so a microdollar_usage row would double-count.
…oss contexts

Org usage counts toward the same free tier as personal usage. Once
exhausted, the charge goes to whichever context makes the request.
This prevents gaming via multiple orgs.

RSO commented Apr 8, 2026

  1. The after() callback doesn't catch JSON parse failures: In route.ts:99-110, the after() callback does await cloned.json(). If the upstream response isn't valid JSON (network error, truncated response), this throws and the entire after() callback silently fails. The old code had a try/catch around this; the new code removed it. Even non-billing logging failures silently disappear.

markijbema: Is this correct? I think we do want to send something to Sentry?

We only clone responses with a status below 400, so I think the risk is rather small.

@RSO RSO enabled auto-merge (squash) April 8, 2026 17:53
@RSO RSO merged commit 6aa4e3b into main Apr 8, 2026
30 checks passed
@RSO RSO deleted the RSO/precious-amazonsaurus branch April 8, 2026 17:56