Description
Summary
TokenRateLimitPolicy applies rate limiting before Gateway API route matching occurs, causing 404 responses to consume user quotas. This results in legitimate requests being blocked when users make mistakes in URLs or follow incorrect documentation.
Environment
- Kuadrant Version: (deployed via Helm charts)
- OpenShift Version: 4.19.9+
- Gateway API Version: OpenShift native implementation
- Cluster: OpenShift on AWS
- Gateway Controller: openshift.io/gateway-controller/v1
Steps to Reproduce
1. Setup: Deploy Kuadrant with TokenRateLimitPolicy on Gateway (rate limit: 5 requests)
2. Get authentication token from MaaS API
3. Make 5 requests to wrong URL (missing endpoint path):
   for i in {1..5}; do
     curl -H "Authorization: Bearer $TOKEN" \
       -H "Content-Type: application/json" \
       -d '{"model": "test", "prompt": "Hello"}' \
       "http://maas.example.com/llm/model-name"  # Missing /v1/chat/completions
   done
4. Make 6th request to wrong URL:
   curl -H "Authorization: Bearer $TOKEN" \
     "http://maas.example.com/llm/model-name"
5. Test correct URL:
   curl -H "Authorization: Bearer $TOKEN" \
     "http://maas.example.com/llm/model-name/v1/chat/completions"
Expected Behavior
- Step 3: All 5 requests return 404 Not Found without consuming rate limit quota
- Step 4: Return 404 Not Found without consuming rate limit quota
- Step 5: Return 200 OK with successful response
Actual Behavior
- Step 3: All 5 requests return 404 Not Found and consume quota (5/5 requests used)
- Step 4: Return 429 Rate Limit Exceeded (quota exhausted)
- Step 5: Return 429 Rate Limit Exceeded (legitimate request blocked)
Root Cause
TokenRateLimitPolicy is applied at the Gateway level before route matching:
Request → TokenRateLimitPolicy (quota consumed) → Gateway Routing → 404
Should be:
Request → Gateway Routing → Route Match → TokenRateLimitPolicy → Backend
Request → Gateway Routing → No Match → 404 (no quota consumption)
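The difference between the two orders can be made concrete with a toy model. The sketch below is plain shell and does not talk to a gateway or to Kuadrant. `LIMIT`, `route_matches`, `limit_then_route`, `route_then_limit` and `run_scenario` are hypothetical names introduced only for illustration, and the rule that only paths ending in /v1/chat/completions have a route is an assumption taken from the reproduction URLs.

```shell
#!/bin/sh
# Toy model of the two evaluation orders. Illustrative only: no real gateway involved.

LIMIT=5   # per-user limit used in the reproduction
used=0    # quota consumed so far
status=0  # status code of the most recent simulated request

# Assumption: only the full completions path has a matching route.
route_matches() {
  case "$1" in
    */v1/chat/completions) return 0 ;;
    *) return 1 ;;
  esac
}

# Observed order: the limiter runs first, so unrouted requests are charged too.
limit_then_route() {
  if [ "$used" -ge "$LIMIT" ]; then status=429; return 0; fi
  used=$((used + 1))
  if route_matches "$1"; then status=200; else status=404; fi
}

# Proposed order: unmatched requests return 404 before the limiter is consulted.
route_then_limit() {
  if ! route_matches "$1"; then status=404; return 0; fi
  if [ "$used" -ge "$LIMIT" ]; then status=429; return 0; fi
  used=$((used + 1))
  status=200
}

# Replay the reproduction: six requests to the wrong URL, then one to the correct URL.
run_scenario() {
  handler=$1
  used=0
  for i in 1 2 3 4 5 6; do
    "$handler" /llm/model-name
    echo "$handler request $i to /llm/model-name: $status (quota: $used/$LIMIT)"
  done
  "$handler" /llm/model-name/v1/chat/completions
  echo "$handler request 7 to /llm/model-name/v1/chat/completions: $status (quota: $used/$LIMIT)"
}

run_scenario limit_then_route
run_scenario route_then_limit
```

Under these assumptions, `limit_then_route` ends with a 429 for the correct URL, while `route_then_limit` leaves the quota untouched by the six 404s and answers the correct URL with 200.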
Impact
- Poor User Experience: Typos in URLs exhaust user quotas
- Legitimate Requests Blocked: Correct requests fail after URL mistakes
- Billing/Quota Issues: Users charged for failed requests
Configuration
Gateway
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: maas-default-gateway
  namespace: openshift-ingress
spec:
  gatewayClassName: openshift-default
  listeners:
  - name: https
    port: 443
    protocol: HTTPS
    hostname: maas.apps.cluster.example.com
TokenRateLimitPolicy
apiVersion: kuadrant.io/v1alpha1
kind: TokenRateLimitPolicy
metadata:
  name: gateway-token-rate-limits
  namespace: openshift-ingress
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: Gateway
    name: maas-default-gateway
  limits:
    per-user:
      rates:
      - limit: 5
        duration: 10s
Reproduction Environment
This was discovered and reproduced with https://github.com/opendatahub-io/maas-billing in a customer environment:
- Gateway: maas.apps.cluster-tvp85.tvp85.sandbox1981.opentlc.com
- Model endpoint: /llm/facebook-opt-125m-simulated/v1/chat/completions
Logs
Example sequence showing the issue:
Request 1 to /llm/model-name: 404 (quota: 1/5)
Request 2 to /llm/model-name: 404 (quota: 2/5)
Request 3 to /llm/model-name: 404 (quota: 3/5)
Request 4 to /llm/model-name: 404 (quota: 4/5)
Request 5 to /llm/model-name: 404 (quota: 5/5)
Request 6 to /llm/model-name: 429 Rate Limit Exceeded
Request 7 to /llm/model-name/v1/chat/completions: 429 Rate Limit Exceeded
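A sequence like the one above can be collected from the client side by printing only the status code of each request. The helper below is a sketch, not part of the original report: `probe` and `BASE_URL` are hypothetical names, `$TOKEN` is assumed to already hold the token obtained from the MaaS API, and the host is the placeholder used in the reproduction commands. The quota counter is not visible to the client, so only the path and status are printed.

```shell
#!/bin/sh
# Hypothetical helper: print "Request to <path>: <status>" for one POST.

BASE_URL="http://maas.example.com"   # placeholder host from the reproduction steps

probe() {
  # -s silences progress output, -o /dev/null discards the body,
  # -w '%{http_code}' prints only the HTTP status code.
  code=$(curl -s -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    -d '{"model": "test", "prompt": "Hello"}' \
    "$BASE_URL$1")
  echo "Request to $1: $code"
}

# Usage (left commented out so that pasting this block sends no traffic):
# for i in 1 2 3 4 5 6; do probe /llm/model-name; done
# probe /llm/model-name/v1/chat/completions
```

Run within a single rate-limit window, the commented loop followed by the final call should replay requests 1 through 7 from the sequence above.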
Suggested Fix
Rate limiting policies should be evaluated only after a successful route match, or Kuadrant should provide configuration to exclude certain HTTP status codes (such as 404) from quota consumption.
Possible approaches:
- Apply TokenRateLimitPolicy at HTTPRoute level instead of Gateway level
- Add configuration to exclude specific status codes from rate limiting
- Change policy evaluation order to happen after routing
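The first approach could look roughly like the following. This is a sketch under assumptions, not a verified fix: the policy name `route-token-rate-limits`, the HTTPRoute name `model-route`, the namespace `llm` and the output file name are all hypothetical, and the `limits` block is copied unchanged from the configuration in this report. Whether attaching the policy at route level actually keeps unmatched requests from being charged has not been verified here.

```shell
#!/bin/sh
# Write a route-scoped variant of the policy to a local file for review.
# All names marked "hypothetical" are placeholders, not taken from the report.
cat > route-token-rate-limits.yaml <<'EOF'
apiVersion: kuadrant.io/v1alpha1
kind: TokenRateLimitPolicy
metadata:
  name: route-token-rate-limits   # hypothetical policy name
  namespace: llm                  # hypothetical; assumed to be the HTTPRoute's namespace
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute               # was: Gateway
    name: model-route             # hypothetical HTTPRoute name
  limits:
    per-user:                     # limits copied unchanged from the report
      rates:
      - limit: 5
        duration: 10s
EOF

# After review, apply it to the cluster with:
#   kubectl apply -f route-token-rate-limits.yaml
```

The only substantive change from the Gateway-level policy is the `targetRef`; the Gateway-level policy would need to be removed or narrowed so that it no longer applies to the same traffic.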
Additional Context
- Related to Gateway API policy attachment and request processing order
- Affects any application using Kuadrant TokenRateLimitPolicy at Gateway level
- Issue discovered during MaaS Platform validation guide testing