fix(engine): add cache size limits to prevent unbounded memory growth #45
Conversation
Greptile Summary

Key Changes:
- Added a MAX_CACHE_SIZE limit and an insert_with_eviction helper to the config discovery and tokenizer caches.

Issues Found:
- The eviction strategy is non-deterministic (HashMap iteration order), and inserts that update an existing key can trigger unnecessary evictions.

Confidence Score: 3/5

| Filename | Overview |
|---|---|
| src/cortex-engine/src/config/config_discovery.rs | Added MAX_CACHE_SIZE limit and insert_with_eviction helper, but eviction strategy is non-deterministic |
| src/cortex-engine/src/tokenizer.rs | Added MAX_CACHE_SIZE limit and insert_with_eviction helper with same non-deterministic eviction issue |
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client
    participant find_up/count
    participant Cache as HashMap Cache
    participant insert_with_eviction
    Client->>find_up/count: Request config/token count
    find_up/count->>Cache: Check cache (read lock)
    alt Cache hit
        Cache-->>find_up/count: Return cached value
        find_up/count-->>Client: Return result
    else Cache miss
        find_up/count->>find_up/count: Compute result
        find_up/count->>Cache: Acquire write lock
        find_up/count->>insert_with_eviction: Insert with eviction
        alt Cache at MAX_CACHE_SIZE
            insert_with_eviction->>Cache: keys().next()
            Cache-->>insert_with_eviction: Arbitrary key
            insert_with_eviction->>Cache: Remove arbitrary entry
        end
        insert_with_eviction->>Cache: Insert new entry
        Cache-->>find_up/count: Release lock
        find_up/count-->>Client: Return result
    end
```
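In code, the diagram's lock flow might look roughly like the sketch below. This is an illustration under assumptions, not the crate's actual API: the `CACHE` static, the `String`/`u64` key and value types, the whitespace-count computation, and the `MAX_CACHE_SIZE` value are all stand-ins for whatever `find_up`/`count` actually use.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::sync::{LazyLock, RwLock};

// Illustrative limit; the PR's actual constant may differ.
const MAX_CACHE_SIZE: usize = 1024;

// Stand-in global cache; the key/value types are assumptions for this sketch.
static CACHE: LazyLock<RwLock<HashMap<String, u64>>> =
    LazyLock::new(|| RwLock::new(HashMap::new()));

// Helper as shown in the diff hunks below.
fn insert_with_eviction<K: Eq + Hash + Clone, V>(cache: &mut HashMap<K, V>, key: K, value: V) {
    if cache.len() >= MAX_CACHE_SIZE {
        // Removes an arbitrary entry (HashMap iteration order).
        if let Some(k) = cache.keys().next().cloned() {
            cache.remove(&k);
        }
    }
    cache.insert(key, value);
}

fn count(text: &str) -> u64 {
    // Fast path: cache hit under a read lock.
    if let Some(&n) = CACHE.read().unwrap().get(text) {
        return n;
    }
    // Cache miss: compute the result, then insert under a write lock.
    let n = text.split_whitespace().count() as u64; // stand-in computation
    let mut cache = CACHE.write().unwrap();
    insert_with_eviction(&mut *cache, text.to_string(), n);
    n
}
```

Note that the hit path only ever takes the read lock, so concurrent readers do not contend; only misses serialize on the write lock.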
2 files reviewed, 4 comments
src/cortex-engine/src/config/config_discovery.rs

```rust
fn insert_with_eviction<K: Eq + Hash + Clone, V>(cache: &mut HashMap<K, V>, key: K, value: V) {
    if cache.len() >= MAX_CACHE_SIZE {
        // Remove first entry (simple eviction strategy)
        if let Some(k) = cache.keys().next().cloned() {
            cache.remove(&k);
        }
    }
    cache.insert(key, value);
}
```
HashMap iteration order is non-deterministic in Rust, so `keys().next()` can return different entries across runs, making eviction unpredictable. Consider using a proper LRU cache (see `src/cortex-file-search/src/cache.rs` for an example with `access_order` tracking) or a simpler FIFO with a `VecDeque` to track insertion order.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/cortex-engine/src/config/config_discovery.rs
Line: 28:36
Comment:
HashMap iteration order is non-deterministic in Rust - `keys().next()` can return different entries across runs, making eviction unpredictable. Consider using a proper LRU cache (see `src/cortex-file-search/src/cache.rs` for an example with `access_order` tracking) or a simpler FIFO with `VecDeque` to track insertion order.
How can I resolve this? If you propose a fix, please make it concise.
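A minimal sketch of the `VecDeque`-based FIFO alternative the comment suggests. The `FifoCache` wrapper type and the `MAX_CACHE_SIZE` value here are illustrative, not part of the codebase:

```rust
use std::collections::{HashMap, VecDeque};
use std::hash::Hash;

const MAX_CACHE_SIZE: usize = 1024; // illustrative value, not the PR's constant

/// Hypothetical FIFO-bounded cache: a HashMap for lookups plus a VecDeque
/// recording insertion order, so eviction always removes the oldest entry.
struct FifoCache<K: Eq + Hash + Clone, V> {
    map: HashMap<K, V>,
    order: VecDeque<K>,
}

impl<K: Eq + Hash + Clone, V> FifoCache<K, V> {
    fn new() -> Self {
        Self { map: HashMap::new(), order: VecDeque::new() }
    }

    fn insert(&mut self, key: K, value: V) {
        // Updating an existing key must not evict or duplicate its order entry.
        if self.map.contains_key(&key) {
            self.map.insert(key, value);
            return;
        }
        if self.map.len() >= MAX_CACHE_SIZE {
            // Deterministic eviction: drop the oldest insertion first.
            if let Some(oldest) = self.order.pop_front() {
                self.map.remove(&oldest);
            }
        }
        self.order.push_back(key.clone());
        self.map.insert(key, value);
    }

    fn get(&self, key: &K) -> Option<&V> {
        self.map.get(key)
    }
}
```

Tracking insertion order makes eviction deterministic: the oldest entry always goes first, regardless of how the HashMap hashes its keys.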
src/cortex-engine/src/tokenizer.rs

```rust
fn insert_with_eviction<K: Eq + Hash + Clone, V>(cache: &mut HashMap<K, V>, key: K, value: V) {
    if cache.len() >= MAX_CACHE_SIZE {
        // Remove first entry (simple eviction strategy)
        if let Some(k) = cache.keys().next().cloned() {
            cache.remove(&k);
        }
    }
    cache.insert(key, value);
}
```
HashMap iteration order is non-deterministic in Rust, so `keys().next()` returns an arbitrary entry, not necessarily the oldest. For token caching, an LRU policy would be more effective, since frequently used text patterns should remain cached. Consider using a `VecDeque` or the existing LRU pattern from `src/cortex-file-search/src/cache.rs`.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/cortex-engine/src/tokenizer.rs
Line: 15:23
Comment:
HashMap iteration order is non-deterministic in Rust - `keys().next()` returns an arbitrary entry, not necessarily the oldest. For token caching, an LRU policy would be more effective since frequently-used text patterns should remain cached. Consider using a `VecDeque` or the existing LRU pattern from `src/cortex-file-search/src/cache.rs`.
How can I resolve this? If you propose a fix, please make it concise.
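A generic sketch of the LRU pattern the comment points to. It does not reproduce the actual `src/cortex-file-search/src/cache.rs` implementation; the `LruCache` name and the O(n) `touch` are simplifications for illustration:

```rust
use std::collections::{HashMap, VecDeque};
use std::hash::Hash;

/// Hypothetical LRU-bounded cache: `access_order` holds keys from
/// least- to most-recently used; reads and writes move a key to the back.
struct LruCache<K: Eq + Hash + Clone, V> {
    map: HashMap<K, V>,
    access_order: VecDeque<K>,
    capacity: usize,
}

impl<K: Eq + Hash + Clone, V> LruCache<K, V> {
    fn new(capacity: usize) -> Self {
        Self { map: HashMap::new(), access_order: VecDeque::new(), capacity }
    }

    /// Move `key` to the most-recently-used position. O(n), which is
    /// tolerable for small, bounded caches.
    fn touch(&mut self, key: &K) {
        if let Some(pos) = self.access_order.iter().position(|k| k == key) {
            let k = self.access_order.remove(pos).unwrap();
            self.access_order.push_back(k);
        }
    }

    fn get(&mut self, key: &K) -> Option<&V> {
        if self.map.contains_key(key) {
            self.touch(key);
        }
        self.map.get(key)
    }

    fn insert(&mut self, key: K, value: V) {
        if self.map.contains_key(&key) {
            self.touch(&key);
        } else {
            if self.map.len() >= self.capacity {
                // Evict the least-recently-used entry (front of the deque).
                if let Some(lru) = self.access_order.pop_front() {
                    self.map.remove(&lru);
                }
            }
            self.access_order.push_back(key.clone());
        }
        self.map.insert(key, value);
    }
}
```

For token caching this keeps hot text patterns resident: every hit moves the key to the back of `access_order`, while eviction always pops from the front.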
src/cortex-engine/src/config/config_discovery.rs

```rust
/// Insert a key-value pair into the cache with eviction when full.
/// When the cache reaches MAX_CACHE_SIZE, removes an arbitrary entry before inserting.
fn insert_with_eviction<K: Eq + Hash + Clone, V>(cache: &mut HashMap<K, V>, key: K, value: V) {
    if cache.len() >= MAX_CACHE_SIZE {
```
Edge case: if the cache is exactly at `MAX_CACHE_SIZE` and we insert a key that already exists (updating its value), we'll unnecessarily evict an entry first. Consider checking whether the key exists before evicting:

```rust
if cache.len() >= MAX_CACHE_SIZE && !cache.contains_key(&key) {
```
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/cortex-engine/src/config/config_discovery.rs
Line: 29:29
Comment:
Edge case: if the cache is exactly at `MAX_CACHE_SIZE` and we insert a key that already exists (updating its value), we'll unnecessarily evict an entry first. Consider checking if the key exists before eviction:
```
if cache.len() >= MAX_CACHE_SIZE && !cache.contains_key(&key) {
```
How can I resolve this? If you propose a fix, please make it concise.
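Applying the suggested guard to the helper from the diff might look like this sketch (the `MAX_CACHE_SIZE` value is assumed):

```rust
use std::collections::HashMap;
use std::hash::Hash;

const MAX_CACHE_SIZE: usize = 1024; // illustrative value

/// Skip eviction when the insert merely updates an existing key,
/// so a full cache is not shrunk needlessly.
fn insert_with_eviction<K: Eq + Hash + Clone, V>(cache: &mut HashMap<K, V>, key: K, value: V) {
    if cache.len() >= MAX_CACHE_SIZE && !cache.contains_key(&key) {
        // The eviction target is still arbitrary; see the FIFO/LRU sketches above.
        if let Some(k) = cache.keys().next().cloned() {
            cache.remove(&k);
        }
    }
    cache.insert(key, value);
}
```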
src/cortex-engine/src/tokenizer.rs

```rust
/// Insert a key-value pair into the cache with eviction when full.
/// When the cache reaches MAX_CACHE_SIZE, removes an arbitrary entry before inserting.
fn insert_with_eviction<K: Eq + Hash + Clone, V>(cache: &mut HashMap<K, V>, key: K, value: V) {
    if cache.len() >= MAX_CACHE_SIZE {
```
Edge case: if the cache is exactly at `MAX_CACHE_SIZE` and we insert a key that already exists (updating its value), we'll unnecessarily evict an entry first. Consider checking whether the key exists before evicting, as in the sketch above:

```rust
if cache.len() >= MAX_CACHE_SIZE && !cache.contains_key(&key) {
```
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/cortex-engine/src/tokenizer.rs
Line: 16:16
Comment:
Edge case: if the cache is exactly at `MAX_CACHE_SIZE` and we insert a key that already exists (updating its value), we'll unnecessarily evict an entry first. Consider checking if the key exists before eviction:
```
if cache.len() >= MAX_CACHE_SIZE && !cache.contains_key(&key) {
```
How can I resolve this? If you propose a fix, please make it concise.

This PR consolidates the following memory and storage fixes:
- #44: Add cleanup for stale file locks to prevent memory leak
- #45: Add cache size limits to prevent unbounded memory growth
- #47: Add fsync after file writes to prevent data loss
- #50: Bound ToolResponseStore size and cleanup consumed entries
- #51: Eliminate TOCTOU races in MCP server and plugin registry
- #52: Improve path validation and tilde expansion

Key changes:
- Added periodic cleanup of stale file locks
- Implemented LRU cache limits for config discovery and tokenizer
- Added fsync calls after critical file writes
- Created bounded ToolResponseStore with automatic cleanup
- Fixed time-of-check-time-of-use races
- Improved path validation security
Consolidated into #80 - fix: consolidated memory and storage improvements
Summary
Fixes #5146 and #5109: unbounded cache growth in the config and token caches.
Problem
Global caches grow without limit, potentially causing OOM in long sessions.
Solution
Added a `MAX_CACHE_SIZE` limit with simple eviction when capacity is exceeded.