
fix(ai): checkAiQuota and acquireAiQuota fail-closed (Onda 6, B-7) #196

Merged
adm01-debug merged 2 commits into main from cleanup/onda-6-ai-quota-fail-closed on May 14, 2026

Conversation

@adm01-debug (Owner) commented May 14, 2026

Onda 6 (Wave 6) of the pre-production hardening. Blocker B-7 from the May 10 audit.

PROBLEM:
When the check_ai_quota or acquire_ai_quota RPC failed (slow database,
function recreated, bad Postgres connection), ai-usage.ts returned
allowed: true, which is FAIL-OPEN behavior. This allowed UNLIMITED
AI calls while the quota system was malfunctioning.

Loss scenario: Gemini 2.5 Pro costs US$ 1.25/1M input tokens + US$ 10/1M
output tokens. 8 unmonitored hours with the bug active = ~US$ 400. A weekend = US$ 2,000.
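
The pricing above can be turned into a quick back-of-the-envelope estimate. This is only an illustrative sketch: `estimateCostUsd` is a hypothetical helper, and only the two per-million prices and the US$ 400 / 8 h figure come from the description. The actual token volume behind that figure is not stated.

```typescript
// Prices quoted in the PR description for Gemini 2.5 Pro (USD per 1M tokens).
const USD_PER_MILLION_INPUT = 1.25;
const USD_PER_MILLION_OUTPUT = 10;

// Hypothetical helper: cost in USD for a given token volume.
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1_000_000) * USD_PER_MILLION_INPUT +
    (outputTokens / 1_000_000) * USD_PER_MILLION_OUTPUT
  );
}

// The ~US$ 400 over 8 hours quoted above implies this average burn rate.
const impliedUsdPerHour = 400 / 8;

console.log(estimateCostUsd(1_000_000, 1_000_000)); // 11.25
console.log(impliedUsdPerHour); // 50
```
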

DISCOVERY DURING INSPECTION:
The audit mentioned only checkAiQuota (line 62), but while inspecting
the file I found its MORE CRITICAL sibling, acquireAiQuota (line ~80),
which is called by callAiWithTracking, the entry point for ALL
11 AI edge functions in the system (expert-chat, generate-mockup,
voice-agent, visual-search, etc.).

checkAiQuota had ZERO real callers in production (an exported but
orphaned function). Fixed anyway as defense in depth.

FIX:
Both functions now return:
allowed: false
reason: {quota_check|acquire}_failed_security_lock

Existing callers (callAiWithTracking, ai-router/index.ts) already handle
allowed: false via QuotaExceededError → the client receives HTTP 429
"Limite mensal de IA atingido. Contate o administrador."
("Monthly AI limit reached. Contact the administrator.")
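
The caller-side path described above could look roughly like the sketch below. It is not the project's code: the `QuotaResult` shape, this `QuotaExceededError` stand-in, the `quota_exceeded` default reason, and the helpers `assertQuotaAllowed`, `toHttpStatus`, and `handleRequest` are all assumptions made for illustration. Only the `allowed`/`reason` fields, the reason strings, the 429 status, and the message text come from the PR.

```typescript
// Shape assumed from the fields named in this PR; the real type may have more fields.
interface QuotaResult {
  allowed: boolean;
  reason?: string;
}

// Hypothetical stand-in for the project's QuotaExceededError.
class QuotaExceededError extends Error {
  readonly status = 429;
  reason: string;

  constructor(reason: string) {
    super("Limite mensal de IA atingido. Contate o administrador.");
    // Keeps instanceof working even when compiled to older targets.
    Object.setPrototypeOf(this, QuotaExceededError.prototype);
    this.name = "QuotaExceededError";
    this.reason = reason;
  }
}

// Throws when the quota layer denied the call, including the fail-closed case.
function assertQuotaAllowed(quota: QuotaResult): void {
  if (!quota.allowed) {
    throw new QuotaExceededError(quota.reason ?? "quota_exceeded");
  }
}

// Maps an error to the HTTP status the client would see.
function toHttpStatus(err: unknown): number {
  return err instanceof QuotaExceededError ? err.status : 500;
}

// Hypothetical request handler: 200 when allowed, otherwise the mapped status.
function handleRequest(quota: QuotaResult): number {
  try {
    assertQuotaAllowed(quota);
    return 200;
  } catch (err) {
    return toHttpStatus(err);
  }
}
```

With this shape, an RPC failure and a genuinely exhausted quota are indistinguishable to the client (both yield 429); only the `reason` carried on the error tells them apart in logs.
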

OBSERVABILITY:
console.error calls on these failures are captured by GlitchTip (Onda 5,
captureConsoleIntegration). Every quota failure becomes a trackable
issue at erros.atomicabr.com.br.

VALIDATION:

  • TS syntax validated via standalone esbuild
  • 11 indirect callers confirmed to handle 429
  • 1 direct caller (ai-router) confirmed to handle !allowed
  • RPCs check_ai_quota + acquire_ai_quota confirmed in the prod database

Risk: low. 1 logic file (~16 lines), 1 new doc.
Details: docs/hardening/ONDA-6-AI-QUOTA-FAIL-CLOSED.md


Summary by cubic

Enforces fail-closed behavior in checkAiQuota and acquireAiQuota so quota RPC failures block usage instead of allowing it. Prevents runaway costs and addresses audit blocker B-7 (Onda 6).

  • Bug Fixes
    • In supabase/functions/_shared/ai-usage.ts, both functions now return allowed: false on RPC errors with reasons quota_check_failed_security_lock and acquire_failed_security_lock.
    • Existing callers (callAiWithTracking, _shared/ai-router) convert this to HTTP 429 via QuotaExceededError, matching normal quota-exceeded behavior.
    • Errors are logged via console.error and captured by GlitchTip; added docs/hardening/ONDA-6-AI-QUOTA-FAIL-CLOSED.md with context and runbook.

Written for commit b955a59. Summary will update on new commits.

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Improved the safety of the AI quota check: requests are now denied (HTTP 429) when validation fails, preventing uncontrolled consumption.
  • Documentation

    • Added documentation on the security behavior change for AI quota failures.


Before: if the check_ai_quota/acquire_ai_quota RPC failed (slow database,
function recreated, etc.), it returned `allowed: true`, permitting
UNLIMITED AI calls. High financial risk: 8 unmonitored hours could
generate ~US$ 400 in Gemini 2.5 Pro / GPT-5 usage.

Now: both return `allowed: false` with reason
`{quota_check,acquire}_failed_security_lock`. Existing callers
(callAiWithTracking, ai-router) already handle allowed=false via
QuotaExceededError → the client receives a normal HTTP 429.

console.error is now captured by GlitchTip (Onda 5): every
quota failure becomes a trackable issue at erros.atomicabr.com.br.

Validation:
  - TS syntax validated via standalone esbuild
  - 11 indirect callers (via callAiWithTracking) already handle 429
  - 1 direct caller (ai-router/index.ts) already handles !allowed
  - RPCs check_ai_quota and acquire_ai_quota confirmed in the prod database

Blocker: B-7 from the May 10, 2026 audit.
Detalhes: docs/hardening/ONDA-6-AI-QUOTA-FAIL-CLOSED.md
Copilot AI review requested due to automatic review settings May 14, 2026 16:43
@vercel

vercel Bot commented May 14, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
promo-gifts Ready Ready Preview, Comment May 14, 2026 4:43pm

@supabase

supabase Bot commented May 14, 2026

This pull request has been ignored for the connected project doufsxqlfjyuvxuezpln due to reaching the limit of concurrent preview branches.
Go to Project Integrations Settings ↗︎ if you wish to update this limit.


Preview Branches by Supabase.
Learn more about Supabase Branching ↗︎.

@coderabbitai
Contributor

coderabbitai Bot commented May 14, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ddec0075-114a-4170-b293-7c3400ea48de

📥 Commits

Reviewing files that changed from the base of the PR and between 4498dd8 and b955a59.

📒 Files selected for processing (2)
  • docs/hardening/ONDA-6-AI-QUOTA-FAIL-CLOSED.md
  • supabase/functions/_shared/ai-usage.ts

Walkthrough

This PR implements fail-closed behavior for the AI quota: checkAiQuota and acquireAiQuota now return allowed: false when the RPC fails, blocking calls instead of allowing a bypass. It is accompanied by detailed documentation covering impact validation on callers, observability, and rollback.

Changes

Fail-closed AI quota: implementation and validation

Layer / File(s) | Summary

Fail-closed implementation in checkAiQuota and acquireAiQuota
supabase/functions/_shared/ai-usage.ts
checkAiQuota and acquireAiQuota now return allowed: false with a specific reason and zeroed limits when the RPC fails. Logging goes through console.error so GlitchTip captures it. This replaces the previous behavior, which returned allowed: true without a log_id, and so prevents a quota bypass caused by unavailability.

Hardening documentation, Onda 6
docs/hardening/ONDA-6-AI-QUOTA-FAIL-CLOSED.md
Records the previous fail-open context, the validated impact on callers (they treat allowed: false as QuotaExceededError, resulting in HTTP 429), a before/after comparison, empirical validations (esbuild, grep, schema inspection of the RPCs), observability by reason and GlitchTip, risks and a simple rollback, and next steps for proactive alerting.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title is specific and clearly describes the main change: the fail-closed implementation for checkAiQuota and acquireAiQuota, including a reference to Onda 6 and ticket B-7.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.



@cubic-dev-ai cubic-dev-ai Bot left a comment


No issues found across 2 files


Copilot AI left a comment


Pull request overview

This PR changes AI quota handling to fail closed when quota RPC calls fail, reducing the risk of unbounded AI usage during infrastructure/database issues. It also adds a hardening note documenting the rationale and validation.

Changes:

  • checkAiQuota now returns allowed: false on check_ai_quota RPC errors.
  • acquireAiQuota now returns allowed: false on acquire_ai_quota RPC errors.
  • Added Onda 6 hardening documentation for the quota fail-closed change.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

File Description
supabase/functions/_shared/ai-usage.ts Updates quota RPC error fallbacks from fail-open to fail-closed.
docs/hardening/ONDA-6-AI-QUOTA-FAIL-CLOSED.md Documents the fail-closed change, impact, validation, and rollback notes.
Comments suppressed due to low confidence (1)

supabase/functions/_shared/ai-usage.ts:95

  • The new fail-closed branch is the production enforcement path used by callAiWithTracking, but there is no regression test covering the RPC error case for acquireAiQuota in the existing AI usage tests. Please add coverage that mocks acquire_ai_quota failing and verifies it returns allowed: false/acquire_failed_security_lock, because a future fallback to allowed: true would reintroduce the cost-control bug.
  if (error) {
    // Onda 6 (B-7): fail-CLOSED. Previously "allow but without log_id": risk of uncontrolled
    // AI spending if the acquire_ai_quota RPC fails. Callers (callAiWithTracking,
    // ai-router) handle allowed=false via QuotaExceededError → 429 response to the client.
    // The error is logged via console.error → captured by GlitchTip (Onda 5).
    console.error("[ai-usage] Atomic quota acquire failed (fail-closed):", error.message);
    return {
      allowed: false,
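
A regression check along the lines Copilot suggests might look like the sketch below. It makes several assumptions: the RPC response handling is pulled out into a hypothetical pure function `mapAcquireResponse` so it can be exercised without a database, the `RpcResponse` and `AcquireResult` shapes are guesses based on the fields named in this PR, and treating a null `data` as a failure is an added assumption. The project's test framework is not shown in the PR, so plain checks are used instead.

```typescript
// Assumed shape of an RPC response: either data or an error with a message.
interface RpcResponse<T> {
  data: T | null;
  error: { message: string } | null;
}

// Assumed result shape; the real one also carries limit fields.
interface AcquireResult {
  allowed: boolean;
  reason?: string;
  log_id?: string;
}

// Hypothetical pure mapping of the acquire_ai_quota response (fail-closed).
function mapAcquireResponse(resp: RpcResponse<AcquireResult>): AcquireResult {
  if (resp.error || resp.data === null) {
    console.error(
      "[ai-usage] Atomic quota acquire failed (fail-closed):",
      resp.error?.message,
    );
    return { allowed: false, reason: "acquire_failed_security_lock" };
  }
  return resp.data;
}

// Regression check: a failing RPC must never yield allowed: true.
function checkFailClosed(): void {
  const result = mapAcquireResponse({
    data: null,
    error: { message: "connection timeout" },
  });
  if (result.allowed !== false) {
    throw new Error("regression: RPC failure allowed the call");
  }
  if (result.reason !== "acquire_failed_security_lock") {
    throw new Error("regression: unexpected reason " + String(result.reason));
  }
}

checkFailClosed();
```
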


Comment on lines 61 to +67
  if (error) {
-   console.error("[ai-usage] Quota check failed:", error.message);
-   return { allowed: true, used: 0, limit: -1, remaining: -1, reason: "quota_check_failed" };
+   // Onda 6 (B-7): fail-CLOSED. Previously fail-open ("allowed: true"): risk of uncontrolled
+   // AI spending if the check_ai_quota RPC fails (slow database, function recreated, etc.).
+   // Now it blocks. The error is logged via console.error → captured by GlitchTip (Onda 5).
+   console.error("[ai-usage] Quota check failed (fail-closed):", error.message);
+   return {
+     allowed: false,
Comment on lines +64 to +65
// Now it blocks. The error is logged via console.error → captured by GlitchTip (Onda 5).
console.error("[ai-usage] Quota check failed (fail-closed):", error.message);
Comment on lines +92 to +93
// The error is logged via console.error → captured by GlitchTip (Onda 5).
console.error("[ai-usage] Atomic quota acquire failed (fail-closed):", error.message);
Comment on lines +130 to +132
The audit also recommends a **proactive alert** when the fail-closed branch fires N times in M minutes (a symptom of an infrastructure problem). That goes in a separate PR (a future Onda, `alert-quota-failures`), probably as a webhook → Slack or a telemetry edge function.

For now, GlitchTip monitoring already covers this case. Issues on this path will be tagged `quota_check_failed_security_lock` or `acquire_failed_security_lock` for filtering.
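
The "N times in M minutes" rule mentioned above could be approximated with a sliding-window counter. The sketch below is purely hypothetical and not part of this PR: `FailureWindow`, its parameters, and the threshold of 3 events per 60 seconds are illustrative assumptions, and delivering the alert (Slack webhook, telemetry function) is left out.

```typescript
// Hypothetical sliding-window counter for fail-closed events.
class FailureWindow {
  private timestamps: number[] = [];
  private readonly threshold: number;
  private readonly windowMs: number;

  constructor(threshold: number, windowMs: number) {
    this.threshold = threshold;
    this.windowMs = windowMs;
  }

  // Records one fail-closed event at `nowMs` and returns true when at least
  // `threshold` events fall inside the trailing window.
  record(nowMs: number): boolean {
    this.timestamps.push(nowMs);
    const cutoff = nowMs - this.windowMs;
    // Drop events that are older than the window.
    this.timestamps = this.timestamps.filter((t) => t > cutoff);
    return this.timestamps.length >= this.threshold;
  }
}

// Example: alert once 3 failures occur within 60 seconds (illustrative values).
const quotaFailures = new FailureWindow(3, 60_000);
quotaFailures.record(0);
quotaFailures.record(1_000);
const shouldAlert = quotaFailures.record(2_000);
console.log(shouldAlert); // true
```

An in-memory counter like this only sees failures within a single running instance; aggregating across separate edge function invocations would need shared storage, which is outside the scope of this sketch.
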
Comment on lines +75 to +76
All 11 indirect callers of `acquireAiQuota` (via `callAiWithTracking`) already handle `QuotaExceededError`:

@adm01-debug adm01-debug merged commit 3469087 into main May 14, 2026
23 of 26 checks passed
@adm01-debug adm01-debug deleted the cleanup/onda-6-ai-quota-fail-closed branch May 14, 2026 16:49
adm01-debug pushed a commit that referenced this pull request May 14, 2026
…I and tests

- Resolves conflicts in baselines (ESLint 853, TSC 1262) and test files
- Fixes non-null assertion in sentry.ts (lint-staged pre-commit)
- Keeps: SidebarNavGroup tests fixed, CI workflow fixes, improved baseline

https://claude.ai/code/session_01MuNDxFSRRaJLsvkBdyQ2dK