Skip to content

feat(rate-limit): minute-bucket middleware with per-key and per-org counters (1.5c)#36

Open
Shaivpidadi wants to merge 2 commits intodevfrom
feat/1.5c-rate-limit-middleware
Open

feat(rate-limit): minute-bucket middleware with per-key and per-org counters (1.5c)#36
Shaivpidadi wants to merge 2 commits intodevfrom
feat/1.5c-rate-limit-middleware

Conversation

@Shaivpidadi
Copy link
Copy Markdown
Member

Summary

Implements §1.5c rate-limit middleware on top of the §1.5b Redis wiring.

  • Minute-bucket sliding-window counters for four dimensions:
    • `req:key:{key_hash}:{minute}` — per-key requests (default 100/min)
    • `tokens:key:{key_hash}:{minute}` — per-key tokens (default 100k/min)
    • `req:org:{org_id}:{minute}` — per-org requests (default 1k/min)
    • `tokens:org:{org_id}:{minute}` — per-org tokens (default 1M/min)
  • FastAPI middleware (`app/rate_limit_middleware.py`) enforces the
    counters before route handlers. Removes the in-route rate-limit calls
    from `precheck` / `postcheck`.
  • `X-RateLimit-Limit`, `X-RateLimit-Remaining`, `X-RateLimit-Reset`
    headers on every authenticated response; `Retry-After` on 429s.
  • Health / readiness / metrics paths bypass the limiter so probes don't
    consume quota.

Cipher security scope (precheck#31 non-blocking #4)

  • REDIS_URL TLS/password posture. `Settings` now rejects
    `REDIS_URL` outside debug mode unless it uses the `rediss://` TLS
    scheme and carries a password. Plaintext is debug-only.
  • Multi-replica fail-open quota-bypass. Default changed to
    fail-closed: when Redis is configured but unreachable the middleware
    returns HTTP 503 `rate limiter unavailable` rather than silently
    falling back to a per-replica in-memory counter (which would multiply
    the effective quota by N replicas). Operators can opt into
    `RATE_LIMIT_FAIL_MODE=open` or `local`; `local` is rejected
    outside `DEBUG=true`.
  • Behavior documented in `PROJECT_SPECS.md#rate-limiting`.

GovernsAI Tracker issue

GOV-1853 / 62aac781-1312-4184-b5e3-39ce37afddb7

Reviewers

Tagging Nexus (code quality) and Cipher (security/arch) — both approvals required.

Test plan

  • `tests/test_rate_limit.py` — 14 unit tests covering minute-bucket
    increment, minute-window reset, partial-window sliding weight, per-key
    vs per-org independence, token cost enforcement, fail-closed /
    fail-open / fail-local Redis outage behavior, and clear().
  • `tests/test_rate_limit_http.py` — 12 integration tests via
    TestClient covering 100-req/min limit, Retry-After header, all three
    X-RateLimit-* headers on success, remaining decrement across requests,
    minute-bucket roll admitting new requests, token-limit 429, and health
    bypass.
  • `tests/test_settings.py` — 10 new tests covering REDIS_URL TLS
    scheme rejection, passwordless rejection, debug-mode plaintext
    acceptance, and `RATE_LIMIT_FAIL_MODE` validation.
  • Full suite: `.venv/bin/python -m pytest tests/ --no-cov` →
    213 passed, 9 skipped.

…ounters (1.5c)

Replaces the sliding-window-log limiter with minute-bucket sliding-window
counters for four dimensions: per-key requests, per-key tokens, per-org
requests, per-org tokens. Counters live under `{dim}:{scope}:{id}:{minute}`
keys with a 2-minute TTL so the previous bucket contributes to the
sliding-window weight.

The limiter now runs as FastAPI middleware (`app.rate_limit_middleware`)
before route handlers, so unauthenticated flood attempts cannot escape the
counter by bailing in `require_api_key`. All responses carry
`X-RateLimit-Limit`, `X-RateLimit-Remaining`, and `X-RateLimit-Reset`
reflecting the most restrictive dimension; denied requests return 429 with
`Retry-After`.

Cipher review scope (precheck#31, non-blocking #4):

- `REDIS_URL` posture validator: non-debug environments must use
  `rediss://` (TLS) and carry a password. Plaintext/passwordless Redis is
  debug-only.
- Multi-replica quota-bypass: resolved by defaulting to `fail-closed` on
  Redis outage. The middleware returns 503 `rate limiter unavailable`
  rather than silently falling back to a per-replica in-memory counter
  that multiplies the effective quota by N. Operators can opt into
  `RATE_LIMIT_FAIL_MODE=open` (accept quota bypass) or `local` (debug-only
  per-replica fallback); `Settings` rejects `local` outside `DEBUG=true`.

Behavior is documented in `precheck/PROJECT_SPECS.md#rate-limiting`.

Refs: 62aac781-1312-4184-b5e3-39ce37afddb7
- black/isort formatting on rate_limit middleware + tests
- cast SQLAlchemy Column[str] to str for org_id return type
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant