fix(cache): reduce DeepSeek prefix churn from dynamic tool catalog growth by caulif · Pull Request #1576 · Hmbown/CodeWhale

caulif · 2026-05-13T12:34:03Z

Summary

DeepSeek's prompt cache only hits when later requests replay the same cached prefix units. In Agent mode, DeepSeek-TUI was still deferring several tools that belong to the normal coding loop, so the model-visible tools array kept growing during read -> edit -> diff/test workflows.

This patch preloads the core coding toolset in Agent mode so those tools are present from turn one instead of being appended later after first use.

What changed

- Keep these tools loaded by default in Agent mode:
  - write_file
  - edit_file
  - apply_patch
  - git_status
  - git_diff
  - git_show
  - git_log
  - git_blame
  - run_tests
- Add regression tests that assert:
  - the core coding toolset is not deferred in Agent mode
  - the core coding toolset is active from the first turn
- Add an analysis document with official links and cache validation:
  - docs/deepseek_cache_prefix_churn.md

Why

DeepSeek's official KV cache docs say prompt caching is enabled by default, and that later requests only hit when they reuse a repeated prefix; because of Sliding Window Attention, matching happens on complete cache prefix units.

DeepSeek KV cache: https://api-docs.deepseek.com/zh-cn/guides/kv_cache
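
To make the prefix-unit behavior concrete, here is a minimal sketch of how a cache that matches only whole units would treat two consecutive requests. The function names, the use of `u32` token IDs, and the unit size passed in are illustrative assumptions, not DeepSeek's implementation or its documented unit size.

```rust
/// Number of leading tokens that two requests have in common.
fn common_prefix_len(prev: &[u32], next: &[u32]) -> usize {
    prev.iter()
        .zip(next.iter())
        .take_while(|(a, b)| a == b)
        .count()
}

/// Tokens that could be served from cache if only complete units match.
/// `unit` is a hypothetical cache-unit size in tokens (an assumption).
fn cacheable_tokens(prev: &[u32], next: &[u32], unit: usize) -> usize {
    assert!(unit > 0, "unit size must be positive");
    // Round the shared prefix down to a whole number of units.
    (common_prefix_len(prev, next) / unit) * unit
}
```

Under this model, a single changed token early in the prompt, such as a newly appended tool definition, caps the reusable prefix at the last complete unit before the change, no matter how much identical text follows it.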

Claude Code's official docs describe a similar context-stability principle from the opposite direction: tool definitions are deferred and loaded on demand so the always-present prefix stays small and unchanged.

Claude Code how-it-works: https://code.claude.com/docs/en/how-claude-code-works

DeepSeek-TUI already sorts tool partitions deterministically and appends deferred activations to the tail, which is good. The remaining problem was that many tools the project itself treats as first-class coding tools were still deferred, so a normal Agent session kept mutating the visible tool catalog.
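
The ordering described above can be sketched as follows. This is a simplified illustration, not the project's actual code: `active_tool_list` and its signature are hypothetical, and the tool names are taken from this PR only as sample data.

```rust
use std::collections::BTreeSet;

/// Build the model-visible tool order: preloaded tools sorted by name,
/// followed by deferred tools in the order they were first activated.
/// (Hypothetical helper; the real implementation may differ.)
fn active_tool_list(preloaded: &[&str], activated: &[&str]) -> Vec<String> {
    // BTreeSet gives a deterministic, sorted, de-duplicated head.
    let head: BTreeSet<&str> = preloaded.iter().copied().collect();
    let mut out: Vec<String> = head.iter().map(|s| s.to_string()).collect();
    for name in activated {
        // Append each deferred tool once, at the tail, on first activation.
        let already_listed = out.iter().any(|n| n.as_str() == *name);
        if !already_listed {
            out.push(name.to_string());
        }
    }
    out
}
```

Appending at the tail keeps the earlier entries byte-identical between turns, but every activation still lengthens the array. Preloading the tools a session will almost certainly use removes those activations altogether.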

Validation

Rust verification

cargo fmt --all
cargo test -p deepseek-tui --bin deepseek-tui agent_mode_preloads_core_coding_toolset -- --nocapture
cargo test -p deepseek-tui --bin deepseek-tui initial_active_tools_include_core_coding_toolset_in_agent_mode -- --nocapture
cargo test -p deepseek-tui --bin deepseek-tui non_yolo_mode_retains_default_defer_policy -- --nocapture
cargo test -p deepseek-tui --bin deepseek-tui active_tool_list_pushes_deferred_activations_to_the_tail -- --nocapture
cargo test -p deepseek-tui --bin deepseek-tui deferred_edit_file_first_use_hydrates_schema_without_execution -- --nocapture
cargo test -p deepseek-tui --bin deepseek-tui model_tool_catalog -- --nocapture
cargo build -p deepseek-tui
cargo test --workspace --all-features (passes for this change set's touched areas, but currently reports two unrelated failures in commands::skills::*)

DeepSeek API cache reproduction

Using the official Chat Completions API with a long repeated prefix:

Stable tools on repeat:
- prompt_cache_hit_tokens = 4096
- prompt_cache_miss_tokens = 18

Expanded tool catalog on repeat:
- prompt_cache_hit_tokens = 3328
- prompt_cache_miss_tokens = 791

That is a loss of 768 cache-hit tokens and an increase of 773 miss tokens on the repeated request when the visible tool catalog grows between requests.
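
The arithmetic can be checked with a short sketch. The `CacheUsage` struct and its helpers are hypothetical names introduced here for illustration; only the four token counts come from the measurements above.

```rust
/// Cache counters reported for one request
/// (prompt_cache_hit_tokens / prompt_cache_miss_tokens).
#[derive(Debug, Clone, Copy)]
struct CacheUsage {
    hit_tokens: i64,
    miss_tokens: i64,
}

impl CacheUsage {
    /// Fraction of prompt tokens served from cache.
    fn hit_rate(&self) -> f64 {
        let total = self.hit_tokens + self.miss_tokens;
        if total == 0 {
            0.0
        } else {
            self.hit_tokens as f64 / total as f64
        }
    }
}

/// Returns (hit tokens lost, miss tokens gained) going from `stable` to `expanded`.
fn cache_delta(stable: CacheUsage, expanded: CacheUsage) -> (i64, i64) {
    (
        stable.hit_tokens - expanded.hit_tokens,
        expanded.miss_tokens - stable.miss_tokens,
    )
}

/// The two measurements quoted in this PR description.
fn measured() -> (CacheUsage, CacheUsage) {
    (
        CacheUsage { hit_tokens: 4096, miss_tokens: 18 },
        CacheUsage { hit_tokens: 3328, miss_tokens: 791 },
    )
}
```

With these inputs, `cache_delta` yields the 768 and 773 figures stated above, and the hit rate drops from over 99% of prompt tokens to roughly 81%.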

Notes

This is a focused first fix. It does not try to solve MCP churn, compaction strategy, or tool-result payload size.

gemini-code-assist

Code Review

This pull request improves DeepSeek prompt cache hit rates in Agent mode by preloading core coding tools, such as file editing, git operations, and test execution, rather than deferring them. The changes include new unit tests to verify this behavior and comprehensive documentation detailing the cache churn analysis. The reviewer recommended extracting the list of preloaded tools into a constant for better maintainability and sorting the shell execution tools alphabetically.

gemini-code-assist · 2026-05-13T12:40:26Z

+    let always_loaded_in_agent_mode = matches!(mode, AppMode::Agent)
        && matches!(
            name,
-            "exec_shell"
+            "apply_patch"
+                | "edit_file"
+                | "exec_shell"
                | "exec_shell_wait"
                | "exec_shell_interact"
                | "exec_wait"
                | "exec_interact"
+                | "git_blame"
+                | "git_diff"
+                | "git_log"
+                | "git_show"
+                | "git_status"
+                | "run_tests"
+                | "write_file"
        );


The list of tools in always_loaded_in_agent_mode is becoming quite large. While the matches! macro is efficient, consider extracting this list into a named constant (e.g., CORE_CODING_TOOLS) to improve maintainability and allow reuse in tests without duplication.

Fixed in cad7154: extracted the Agent preload set into AGENT_MODE_PRELOADED_TOOLS and reused it in the tests so the policy and assertions stay in sync.
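
A sketch of what that extraction could look like, for readers without the diff at hand. Only the constant name `AGENT_MODE_PRELOADED_TOOLS`, `AppMode::Agent`, and the tool names come from this thread; the `Other` variant, the function signature, and the lookup strategy are assumptions, and the code in cad7154 may differ.

```rust
/// Simplified stand-in for the application's mode enum.
/// `Other` is a placeholder for every non-Agent mode (an assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AppMode {
    Agent,
    Other,
}

/// Tools kept loaded from the first turn in Agent mode, in alphabetical order.
const AGENT_MODE_PRELOADED_TOOLS: &[&str] = &[
    "apply_patch",
    "edit_file",
    "exec_interact",
    "exec_shell",
    "exec_shell_interact",
    "exec_shell_wait",
    "exec_wait",
    "git_blame",
    "git_diff",
    "git_log",
    "git_show",
    "git_status",
    "run_tests",
    "write_file",
];

/// True when `name` should never be deferred in the given mode.
fn always_loaded_in_agent_mode(mode: AppMode, name: &str) -> bool {
    mode == AppMode::Agent && AGENT_MODE_PRELOADED_TOOLS.contains(&name)
}

/// Guard that a test can use to keep the list alphabetized and free of duplicates.
fn preload_list_is_sorted() -> bool {
    AGENT_MODE_PRELOADED_TOOLS.windows(2).all(|w| w[0] < w[1])
}
```

Because the policy function and the tests both read the same slice, adding a tool to the preload set is a one-line change that the assertions pick up automatically.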

gemini-code-assist · 2026-05-13T12:40:26Z

+                | "exec_shell"
                | "exec_shell_wait"
                | "exec_shell_interact"
                | "exec_wait"
                | "exec_interact"


Fixed in cad7154: the shared Agent preload list is now alphabetized, including the shell execution tools.

Hmbown · 2026-05-23T21:55:28Z

This PR was opened before the v0.8.41 rebrand and is now stale. Feel free to rebase onto current main and reopen. The whale brothers are waiting for you 🐋

fix(cache): reduce DeepSeek prefix churn from dynamic tool catalog gr…

9555801

…owth

gemini-code-assist Bot reviewed May 13, 2026

refactor(cache): extract Agent tool preload set

cad7154

Hmbown mentioned this pull request May 21, 2026

v0.8.42 tracker: inbox-zero triage and backlog close-out #1876

Closed

Hmbown closed this May 23, 2026

caulif wants to merge 2 commits into Hmbown:main from caulif:codex/fix-cache-prefix-churn
