Skip to content

fix(engine): bound ToolResponseStore size and cleanup consumed entries#50

Closed
echobt wants to merge 1 commit intomainfrom
fix/tool-response-store
Closed

fix(engine): bound ToolResponseStore size and cleanup consumed entries#50
echobt wants to merge 1 commit intomainfrom
fix/tool-response-store

Conversation

@echobt
Copy link
Contributor

@echobt echobt commented Feb 4, 2026

Summary

Fixes #5292 and #5293 - ToolResponseStore memory issues.

Problem

  1. Store grows without limit
  2. Consumed responses not removed

Solution

Added ToolResponseStore with:

  • Size limit: MAX_STORE_SIZE constant (500 entries) to prevent unbounded growth
  • Cleanup on read: take() method removes entries when consumed
  • Periodic cleanup: Automatic cleanup of expired entries based on TTL (default 5 minutes)
  • Oldest eviction: When capacity is reached, oldest entries are evicted to make room

API

use cortex_engine::tools::{ToolResponseStore, create_shared_store};

// Create a shared store
let store = create_shared_store();

// Store a response
store.store("call-id", "Read", result).await;

// Get without removing (peek)
let result = store.get("call-id").await;

// Take and remove (consume)
let result = store.take("call-id").await; // Entry is removed

// Manual cleanup
store.cleanup_expired().await;

// Stats and info
let stats = store.stats().await;
let info = store.info().await;

Configuration

  • MAX_STORE_SIZE: 500 (default max entries)
  • DEFAULT_TTL: 5 minutes (entry expiration)
  • CLEANUP_INTERVAL: 1 minute (automatic cleanup frequency)

Fixes #5292 and #5293 - ToolResponseStore memory issues.

Problem:
1. Store grows without limit
2. Consumed responses not removed

Solution:
- Added ToolResponseStore with configurable max size (default: 500 entries)
- Entries are removed when consumed via take() method (#5293)
- Automatic periodic cleanup of expired entries based on TTL
- Eviction of oldest entries when capacity is reached (#5292)

Features:
- MAX_STORE_SIZE constant (500) to prevent unbounded growth
- DEFAULT_TTL (5 minutes) for automatic expiration
- CLEANUP_INTERVAL (1 minute) for periodic cleanup
- get() for peeking without removal
- take() for consuming and removing entries
- cleanup_expired() and cleanup_read() for manual cleanup
- Stats tracking for monitoring store behavior
@greptile-apps
Copy link

greptile-apps bot commented Feb 4, 2026

Greptile Overview

Greptile Summary

Added bounded ToolResponseStore to prevent unbounded memory growth and ensure consumed responses are removed.

Key features implemented:

  • Size limit: MAX_STORE_SIZE (500 entries) with LRU eviction of oldest entries when at capacity
  • Cleanup on consume: take() method removes entries after reading
  • Periodic TTL cleanup: Automatic removal of expired entries (5-minute TTL)
  • Statistics tracking: Monitors stores, reads, takes, evictions, and cleanups
  • Peek vs consume: get() marks as read without removing, take() consumes and removes

Minor issues found:

  • Unused warn import on line 15
  • remove_on_read config field (line 72) is defined but never actually used in the implementation - get() never removes and take() always removes regardless of this setting

Confidence Score: 4/5

  • Safe to merge with minor cleanup - unused config field should be addressed
  • Well-structured implementation with comprehensive tests covering all main paths. The unused remove_on_read config field is a logic issue that should be fixed, but won't cause runtime errors. Unused import is a trivial syntax issue.
  • Pay attention to response_store.rs line 72 - decide whether to implement remove_on_read behavior or remove the unused field

Important Files Changed

Filename Overview
src/cortex-engine/src/tools/response_store.rs New bounded storage for tool responses with size limits, TTL-based cleanup, and statistics tracking - has unused imports and config field
src/cortex-engine/src/tools/mod.rs Added module exports for new response_store with all public types and functions

Sequence Diagram

sequenceDiagram
    participant Client
    participant Store as ToolResponseStore
    participant HashMap as responses: RwLock<HashMap>
    participant Stats as stats: RwLock<StoreStats>
    
    Note over Client,Stats: Store Tool Response
    Client->>Store: store(call_id, tool_name, result)
    Store->>Store: maybe_cleanup() (periodic)
    Store->>HashMap: write().await
    alt at capacity (>= MAX_STORE_SIZE)
        Store->>HashMap: find_oldest_key()
        Store->>HashMap: remove(oldest_key)
        Note over Store: evicted = true
    end
    Store->>HashMap: insert(call_id, StoredResponse)
    Store->>Stats: write().await
    Store->>Stats: total_stored += 1
    alt if evicted
        Store->>Stats: evictions += 1
    end
    Store-->>Client: evicted (bool)
    
    Note over Client,Stats: Peek (Get without removing)
    Client->>Store: get(call_id)
    Store->>HashMap: write().await
    alt entry exists
        Store->>HashMap: set read = true
        Store->>Stats: write().await
        Store->>Stats: reads += 1
        Store-->>Client: Some(ToolResult)
    else
        Store-->>Client: None
    end
    
    Note over Client,Stats: Consume (Take and remove)
    Client->>Store: take(call_id)
    Store->>HashMap: write().await
    alt entry exists
        Store->>HashMap: remove(call_id)
        Store->>Stats: write().await
        Store->>Stats: takes += 1
        Store-->>Client: Some(ToolResult)
    else
        Store-->>Client: None
    end
    
    Note over Client,Stats: Cleanup Expired
    Client->>Store: cleanup_expired()
    Store->>HashMap: write().await
    Store->>HashMap: retain(|_, v| !v.is_expired(ttl))
    alt removed > 0
        Store->>Stats: write().await
        Store->>Stats: expired_cleanups += removed
    end
    Store-->>Client: removed (usize)
Loading

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 files reviewed, 2 comments

Edit Code Review Agent Settings | Greptile

use std::time::{Duration, Instant};

use tokio::sync::RwLock;
use tracing::{debug, warn};
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warn imported but never used

Suggested change
use tracing::{debug, warn};
use tracing::debug;

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/cortex-engine/src/tools/response_store.rs
Line: 15:15

Comment:
`warn` imported but never used

```suggestion
use tracing::debug;
```

<sub>Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!</sub>

How can I resolve this? If you propose a fix, please make it concise.

/// Time-to-live for stored responses.
pub ttl: Duration,
/// Whether to remove entries on read (peek vs consume).
pub remove_on_read: bool,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove_on_read config field is never used - the get() method always keeps entries and take() always removes them regardless of this setting

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/cortex-engine/src/tools/response_store.rs
Line: 72:72

Comment:
`remove_on_read` config field is never used - the `get()` method always keeps entries and `take()` always removes them regardless of this setting

How can I resolve this? If you propose a fix, please make it concise.

echobt added a commit that referenced this pull request Feb 4, 2026
This PR consolidates the following memory and storage fixes:
- #44: Add cleanup for stale file locks to prevent memory leak
- #45: Add cache size limits to prevent unbounded memory growth
- #47: Add fsync after file writes to prevent data loss
- #50: Bound ToolResponseStore size and cleanup consumed entries
- #51: Eliminate TOCTOU races in MCP server and plugin registry
- #52: Improve path validation and tilde expansion

Key changes:
- Added periodic cleanup of stale file locks
- Implemented LRU cache limits for config discovery and tokenizer
- Added fsync calls after critical file writes
- Created bounded ToolResponseStore with automatic cleanup
- Fixed time-of-check-time-of-use races
- Improved path validation security
@echobt
Copy link
Contributor Author

echobt commented Feb 4, 2026

Consolidated into #80 - fix: consolidated memory and storage improvements

@echobt echobt closed this Feb 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant