feat: Add Pool On-Empty Behavior Configuration for Redis Connections#1018
feat: Add Pool On-Empty Behavior Configuration for Redis Connections#1018collin-lee merged 4 commits intoenvoyproxy:mainfrom
Conversation
c590474 to
a505335
Compare
|
hi @collin-lee, @arkodg could you please help me review this PR? |
Signed-off-by: notdu <huudutg@gmail.com>
| // Empty string = use radix default (PoolOnEmptyCreateAfter(1s)) | ||
| if poolOnEmptyBehavior != "" { | ||
| logger.Warnf("Redis pool %s: unknown on-empty behavior '%s', using default (CREATE after 1s)", maskedUrl, poolOnEmptyBehavior) | ||
| } |
There was a problem hiding this comment.
Can we make the default behavior is CREATE and default wait time is 1s?
There was a problem hiding this comment.
What do you think, @collin-lee? Should we set the default value for REDIS_POOL_ON_EMPTY_BEHAVIOR to CREATE and REDIS_POOL_ON_EMPTY_WAIT_DURATION to 1s?
There was a problem hiding this comment.
Isn't it the default behavior when poolOnEmptyBehavior is empty?
// Empty string = use radix default (PoolOnEmptyCreateAfter(1s))
I'm not sure, but my opinion is that I think it's better to fail fast in production. What would be the case to set to CREATE/1s?
There was a problem hiding this comment.
CREATE is the default to maintain backward compatibility with radix's default behavior; ERROR is recommended for production, but changing the default would be a breaking change for existing deployments.
There was a problem hiding this comment.
I've refactored to make it explicit: CREATE with 1s wait duration - which produces the same behavior but is clearer and removes the "magic empty string". Please help me review it @collin-lee
This PR is a follow-up to #987 (REDIS_TIMEOUT configuration).
Current Behavior
The ratelimit service uses radix for Redis connection pooling. The default radix behavior is
PoolOnEmptyCreateAfter(1*time.Second):REDIS_POOL_SIZE=50→ 50 persistent connections in the poolThe ratelimit service currently passes no
PoolOnEmptyoptions to radix, so there is no way to change this behavior.Problem
When Redis becomes unresponsive (paused, network partition, or slow), this default behavior causes connection storms:
This is problematic because:
REDIS_POOL_SIZEappears to limit connections but doesn't enforce itProposed Solution
Introduce configurable pool on-empty behavior via environment variables:
REDIS_POOL_ON_EMPTY_BEHAVIORERROR,CREATE, orWAIT""(radix default: CREATE after 1s)REDIS_POOL_ON_EMPTY_WAIT_DURATION0(immediate)Behavior options:
ERROR: Return error after wait duration. Enforces strict pool size limit. Recommended for production to fail-fast during Redis outages.CREATE: Create overflow connection after wait duration. Current default behavior.WAIT: Block until a connection becomes available. Enforces pool size but risks goroutine buildup.Test Result:
REDIS_POOL_ON_EMPTY_BEHAVIORConfiguration
Memory Limits:
Redis Configuration:
k6 Load Test:
Redis: PAUSED for 5 minutes
Before
REDIS_POOL_ON_EMPTY_BEHAVIORconfigAfter
REDIS_POOL_ON_EMPTY_BEHAVIOR=ERRORconfigChanges
REDIS_POOL_ON_EMPTY_BEHAVIORandREDIS_POOL_ON_EMPTY_WAIT_DURATIONsettingsREDIS_PERSECOND_POOL_ON_EMPTY_BEHAVIORandREDIS_PERSECOND_POOL_ON_EMPTY_WAIT_DURATIONsettingsRelated