## Summary

The `ruvbrain` Cloud Run service currently stores MCP SSE sessions in a process-local `DashMap` (line 182 of `routes.rs`). This forces `max-instances=1` to prevent session-not-found 404s when Cloud Run routes requests across instances. To restore horizontal scaling, sessions must move to a shared external store.
## Code Changes

### 1. Add `redis` dependency to `mcp-brain-server`

```toml
# crates/mcp-brain-server/Cargo.toml
redis = { version = "0.25", features = ["tokio-comp", "connection-manager"] }
```
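The migration plan sets a `REDIS_HOST` env var on the Cloud Run service. As a sketch, the server could derive the connection URL from that variable at startup; the helper name, the localhost fallback, and the port (Memorystore's default, 6379) are assumptions:

```rust
use std::env;

/// Hypothetical helper: build a redis:// URL from the REDIS_HOST env var,
/// falling back to localhost for local development. Port 6379 is assumed
/// (the Memorystore default).
fn redis_url_from_env() -> String {
    let host = env::var("REDIS_HOST").unwrap_or_else(|_| "127.0.0.1".to_string());
    format!("redis://{host}:6379")
}

fn main() {
    env::set_var("REDIS_HOST", "10.0.0.3");
    assert_eq!(redis_url_from_env(), "redis://10.0.0.3:6379");
    env::remove_var("REDIS_HOST");
    assert_eq!(redis_url_from_env(), "redis://127.0.0.1:6379");
    println!("ok");
}
```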
### 2. Replace `DashMap` with a Redis-backed session store
```rust
// New: RedisSessionStore
use dashmap::DashMap;
use tokio::sync::mpsc::Sender;
use anyhow::Result; // assumption: the crate's Result alias comes from anyhow

pub struct RedisSessionStore {
    pool: redis::aio::ConnectionManager,
    local_senders: DashMap<String, Sender<String>>,
}

impl RedisSessionStore {
    pub async fn new(redis_url: &str) -> Result<Self> {
        let client = redis::Client::open(redis_url)?;
        let pool = redis::aio::ConnectionManager::new(client).await?;
        Ok(Self { pool, local_senders: DashMap::new() })
    }

    pub async fn register(&self, session_id: &str, sender: Sender<String>) {
        // Store sender locally (channel is not serializable)
        self.local_senders.insert(session_id.to_string(), sender);

        // Register session existence in Redis (with TTL)
        let mut conn = self.pool.clone();
        let _: () = redis::cmd("SET")
            .arg(format!("mcp:session:{session_id}"))
            .arg("1")
            .arg("EX")
            .arg(3600) // 1 hour TTL
            .query_async(&mut conn)
            .await
            .unwrap_or(());
    }

    pub async fn get(&self, session_id: &str) -> Option<Sender<String>> {
        // Check local first (same instance)
        if let Some(s) = self.local_senders.get(session_id) {
            return Some(s.clone());
        }
        // The session may exist in Redis on another instance. We can't
        // forward a channel across instances, so return None and let the
        // client reconnect. Log for observability.
        None
    }
}
```
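The store distinguishes three outcomes for a session ID: locally registered, registered on another instance (key in Redis but no local sender), or unknown/expired. This decision logic can be sketched with plain stdlib collections standing in for the `DashMap` and Redis (the `resolve` function and `Lookup` enum are illustrative names, not part of the proposed API):

```rust
use std::collections::{HashMap, HashSet};

/// Where a session resolves, mirroring RedisSessionStore::get.
#[derive(Debug, PartialEq)]
enum Lookup {
    Local,         // sender lives on this instance; message can be forwarded
    OtherInstance, // key exists in Redis but sender is elsewhere -> 404, client reconnects
    Unknown,       // never registered, or TTL expired -> 404
}

fn resolve(
    local_senders: &HashMap<String, ()>, // stand-in for DashMap<String, Sender<String>>
    redis_keys: &HashSet<String>,        // stand-in for "mcp:session:{id}" keys in Redis
    session_id: &str,
) -> Lookup {
    if local_senders.contains_key(session_id) {
        return Lookup::Local;
    }
    if redis_keys.contains(&format!("mcp:session:{session_id}")) {
        return Lookup::OtherInstance;
    }
    Lookup::Unknown
}

fn main() {
    let mut local = HashMap::new();
    local.insert("a".to_string(), ());
    let mut redis: HashSet<String> = HashSet::new();
    redis.insert("mcp:session:a".to_string());
    redis.insert("mcp:session:b".to_string());

    assert_eq!(resolve(&local, &redis, "a"), Lookup::Local);
    assert_eq!(resolve(&local, &redis, "b"), Lookup::OtherInstance);
    assert_eq!(resolve(&local, &redis, "c"), Lookup::Unknown);
    println!("ok");
}
```

The `OtherInstance` case is the one this design accepts as a transient 404: the client reconnects and re-establishes its SSE stream on whichever instance serves it.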
### 3. Update `messages_handler`
```rust
async fn messages_handler(...) -> StatusCode {
    let sender = match state.sessions.get(&query.session_id).await {
        Some(s) => s,
        None => {
            // Log session miss for monitoring
            tracing::warn!(
                session_id = %query.session_id,
                "Session not found (may be on another instance)"
            );
            return StatusCode::NOT_FOUND;
        }
    };
    // ... rest unchanged
}
```
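The migration plan also calls for a 1-hour session TTL plus cleanup. Redis expires its key automatically via `EX`, but the local sender map needs its own sweep so dead senders don't accumulate. A minimal stdlib-only sketch of the eviction decision (function names and the plain-seconds timestamp scheme are hypothetical):

```rust
/// TTL matching the Redis EX value above.
const SESSION_TTL_SECS: u64 = 3600;

/// A session is stale once its last activity is TTL or more seconds old.
fn expired(last_seen_secs: u64, now_secs: u64) -> bool {
    now_secs.saturating_sub(last_seen_secs) >= SESSION_TTL_SECS
}

/// Drop expired entries; returns how many were evicted.
fn sweep(sessions: &mut Vec<(String, u64)>, now_secs: u64) -> usize {
    let before = sessions.len();
    sessions.retain(|(_, last_seen)| !expired(*last_seen, now_secs));
    before - sessions.len()
}

fn main() {
    let mut sessions = vec![
        ("fresh".to_string(), 10_000u64),
        ("stale".to_string(), 1_000u64),
    ];
    // At now = 10_060, "stale" is 9_060s old (expired); "fresh" is 60s old.
    let removed = sweep(&mut sessions, 10_060);
    assert_eq!(removed, 1);
    assert_eq!(sessions.len(), 1);
    assert_eq!(sessions[0].0, "fresh");
    println!("ok");
}
```

In the real store the sweep would run on a `tokio` interval over `local_senders`, updating `last_seen` whenever a message is forwarded.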
## Migration Plan

1. Provision Memorystore Redis + VPC connector
2. Add `redis` dependency to `mcp-brain-server`
3. Implement `RedisSessionStore` with local sender cache + Redis existence tracking
4. Add session TTL (1 hour) and cleanup
5. Update Cloud Run service: add VPC connector, set `REDIS_HOST` env var
6. Restore `max-instances=10`
7. Test: SSE connect on Instance A, POST on Instance B → verify session found
8. Monitor: track session miss rate in logs
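The provisioning and deployment steps above might look like the following `gcloud` commands. The instance names (`mcp-sessions`, `mcp-connector`), network, IP range, and service URL are placeholders, and the `/sse` and `/messages` paths are assumptions; verify flags against your project before running:

```shell
# Step 1: provision Memorystore Redis (basic tier) in the service's region
gcloud redis instances create mcp-sessions \
  --region=us-central1 --tier=basic --size=1

# Step 1 (cont.): VPC connector so Cloud Run can reach the private Redis IP
gcloud compute networks vpc-access connectors create mcp-connector \
  --region=us-central1 --network=default --range=10.8.0.0/28

# Steps 5-6: attach the connector, point the service at Redis, restore scaling
gcloud run services update ruvbrain \
  --region=us-central1 \
  --vpc-connector=mcp-connector \
  --set-env-vars=REDIS_HOST=<memorystore-ip> \
  --max-instances=10

# Step 7: open an SSE stream (note the session_id it returns), then POST to it;
# repeat the POST until Cloud Run routes it to a different instance
curl -N "https://<service-url>/sse"
curl -X POST "https://<service-url>/messages?session_id=<id>" \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","method":"ping","id":1}'
```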
## Alternative: Streamable HTTP Transport
MCP protocol v2025+ supports the Streamable HTTP transport, which is stateless (no server-side sessions). This would eliminate the session problem entirely. However, Claude Code currently uses the SSE transport, so this is a longer-term migration.
## Proposed Solution: Memorystore for Redis

Google Cloud Memorystore provides a managed Redis instance that all Cloud Run instances can share.

### GCloud Resources Required

### Estimated Monthly Cost
## Context

- Current limit: `max-instances=1`
- Session store: `crates/mcp-brain-server/src/routes.rs:182`
- Service: `ruvbrain` in `us-central1` (2 CPU, 2GB RAM)

🤖 Generated with claude-flow