fix(cache): reduce DeepSeek prefix churn from dynamic tool catalog growth #1576
caulif wants to merge 2 commits into
Conversation
Code Review
This pull request improves DeepSeek prompt cache hit rates in Agent mode by preloading core coding tools, such as file editing, git operations, and test execution, rather than deferring them. The changes include new unit tests to verify this behavior and comprehensive documentation detailing the cache churn analysis. The reviewer recommended extracting the list of preloaded tools into a constant for better maintainability and sorting the shell execution tools alphabetically.
    let always_loaded_in_agent_mode = matches!(mode, AppMode::Agent)
        && matches!(
            name,
            "apply_patch"
                | "edit_file"
                | "exec_shell"
                | "exec_shell_wait"
                | "exec_shell_interact"
                | "exec_wait"
                | "exec_interact"
                | "git_blame"
                | "git_diff"
                | "git_log"
                | "git_show"
                | "git_status"
                | "run_tests"
                | "write_file"
        );
Fixed in cad7154: extracted the Agent preload set into AGENT_MODE_PRELOADED_TOOLS and reused it in the tests so the policy and assertions stay in sync.
                | "exec_shell"
                | "exec_shell_wait"
                | "exec_shell_interact"
                | "exec_wait"
                | "exec_interact"
The shell execution tools are not strictly sorted alphabetically. For better maintainability as this list grows, consider sorting them (e.g., exec_interact should come before exec_shell).
-               | "exec_shell"
-               | "exec_shell_wait"
-               | "exec_shell_interact"
-               | "exec_wait"
-               | "exec_interact"
+               | "exec_interact"
+               | "exec_shell"
+               | "exec_shell_interact"
+               | "exec_shell_wait"
+               | "exec_wait"
Fixed in cad7154: the shared Agent preload list is now alphabetized, including the shell execution tools.
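A minimal sketch of what the extracted, alphabetized list might look like. Only the constant name AGENT_MODE_PRELOADED_TOOLS, AppMode::Agent, and the tool names come from this thread; the AppMode stand-in, its Other variant, and both helper functions are hypothetical and are not taken from commit cad7154.

```rust
/// Stand-in for the project's mode enum; only `Agent` appears in the diff.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum AppMode {
    Agent,
    /// Hypothetical placeholder for every non-Agent mode.
    Other,
}

/// Tools that are never deferred in Agent mode, kept in alphabetical order.
const AGENT_MODE_PRELOADED_TOOLS: &[&str] = &[
    "apply_patch",
    "edit_file",
    "exec_interact",
    "exec_shell",
    "exec_shell_interact",
    "exec_shell_wait",
    "exec_wait",
    "git_blame",
    "git_diff",
    "git_log",
    "git_show",
    "git_status",
    "run_tests",
    "write_file",
];

/// Hypothetical helper: true when `name` must be present from turn one.
fn always_loaded_in_agent_mode(mode: AppMode, name: &str) -> bool {
    matches!(mode, AppMode::Agent) && AGENT_MODE_PRELOADED_TOOLS.contains(&name)
}

/// Hypothetical helper: true when the names are strictly ascending,
/// which also rules out duplicates.
fn is_sorted_and_unique(names: &[&str]) -> bool {
    names.windows(2).all(|pair| pair[0] < pair[1])
}

fn main() {
    assert!(is_sorted_and_unique(AGENT_MODE_PRELOADED_TOOLS));
    assert!(always_loaded_in_agent_mode(AppMode::Agent, "edit_file"));
    assert!(!always_loaded_in_agent_mode(AppMode::Other, "edit_file"));
}
```

Keeping the list in one constant lets a test assert the ordering directly, so a later unsorted addition fails fast instead of silently changing the tool order.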
This PR was opened before the v0.8.41 rebrand and is now stale. Feel free to rebase onto current
Summary
DeepSeek's prompt cache only hits when later requests replay the same cached prefix units. In Agent mode, DeepSeek-TUI was still deferring several tools that belong to the normal coding loop, so the model-visible tools array kept growing during read -> edit -> diff/test workflows.
This patch preloads the core coding toolset in Agent mode so those tools are present from turn one instead of being appended later after first use.
What changed
Tools now preloaded in Agent mode:
- write_file
- edit_file
- apply_patch
- git_status
- git_diff
- git_show
- git_log
- git_blame
- run_tests
Documentation added:
- docs/deepseek_cache_prefix_churn.md
Why
DeepSeek's official KV cache docs say prompt caching is enabled by default, and that later requests only hit when they reuse a repeated prefix; because of Sliding Window Attention, matching happens on complete cache prefix units.
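To make the "complete cache prefix units" point concrete, here is an illustrative model, not DeepSeek's actual implementation. The function name, the token representation, and the unit size of 64 are all assumptions chosen for the example.

```rust
/// Illustrative model only: count how many tokens of `current` could be
/// served from cache if matching happens on whole units of `unit` tokens.
/// The unit size is a parameter because the real value is an assumption here.
fn cached_prefix_tokens(previous: &[u32], current: &[u32], unit: usize) -> usize {
    assert!(unit > 0, "unit size must be positive");
    // Length of the longest common prefix, in tokens.
    let shared = previous
        .iter()
        .zip(current)
        .take_while(|(a, b)| a == b)
        .count();
    // Only complete units count; a partially matching unit is a miss.
    (shared / unit) * unit
}

fn main() {
    let previous: Vec<u32> = (0..200).collect();
    // Same request, but one token changes at position 150.
    let mut current = previous.clone();
    current[150] = 9_999;

    // 150 tokens match, which covers two complete 64-token units.
    assert_eq!(cached_prefix_tokens(&previous, &current, 64), 128);
    // An unchanged request reuses three complete units out of 200 tokens.
    assert_eq!(cached_prefix_tokens(&previous, &previous, 64), 192);
}
```

Under this model, a change early in the prompt costs far more than its own length: every unit from the first differing token onward becomes a miss.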
Claude Code's official docs describe a similar context-stability principle from the opposite direction: tool definitions are deferred and loaded on demand so the stable prefix stays small and stable.
DeepSeek-TUI already sorts tool partitions deterministically and appends deferred activations to the tail, which is good. The remaining problem was that many tools the project itself treats as first-class coding tools were still deferred, so a normal Agent session kept mutating the visible tool catalog.
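The ordering policy described above can be sketched as follows. This is a simplified model under stated assumptions: active_tool_list is a hypothetical function, not the project's API, and real tool entries carry full schemas rather than bare names.

```rust
/// Hypothetical model of the visible tool list: preloaded tools in sorted
/// order, then deferred tools in the order they were first activated.
fn active_tool_list(preloaded: &[&str], activated: &[&str]) -> Vec<String> {
    let mut list: Vec<String> = preloaded.iter().map(|s| s.to_string()).collect();
    list.sort();
    list.dedup();
    for name in activated {
        // Append each deferred tool once, at the tail.
        if !list.iter().any(|t| t.as_str() == *name) {
            list.push(name.to_string());
        }
    }
    list
}

fn main() {
    // Before the patch: edit_file is deferred and appears only after first use.
    let turn_one = active_tool_list(&["exec_shell", "exec_wait"], &[]);
    let turn_two = active_tool_list(&["exec_shell", "exec_wait"], &["edit_file"]);
    // Tail-append keeps the earlier list as a prefix, but the catalog still changed.
    assert!(turn_two.starts_with(&turn_one));
    assert_ne!(turn_one, turn_two);

    // After the patch: edit_file is preloaded, so first use changes nothing.
    let preloaded = ["edit_file", "exec_shell", "exec_wait"];
    let before_use = active_tool_list(&preloaded, &[]);
    let after_use = active_tool_list(&preloaded, &["edit_file"]);
    assert_eq!(before_use, after_use);
}
```

Tail-appending preserves the earlier tool entries, but anything serialized after the tools array still shifts when the array grows, which is why keeping the catalog identical across turns matters.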
Validation
Rust verification
- cargo fmt --all
- cargo test -p deepseek-tui --bin deepseek-tui agent_mode_preloads_core_coding_toolset -- --nocapture
- cargo test -p deepseek-tui --bin deepseek-tui initial_active_tools_include_core_coding_toolset_in_agent_mode -- --nocapture
- cargo test -p deepseek-tui --bin deepseek-tui non_yolo_mode_retains_default_defer_policy -- --nocapture
- cargo test -p deepseek-tui --bin deepseek-tui active_tool_list_pushes_deferred_activations_to_the_tail -- --nocapture
- cargo test -p deepseek-tui --bin deepseek-tui deferred_edit_file_first_use_hydrates_schema_without_execution -- --nocapture
- cargo test -p deepseek-tui --bin deepseek-tui model_tool_catalog -- --nocapture
- cargo build -p deepseek-tui
- cargo test --workspace --all-features (passes for this change set's touched areas, but currently reports two unrelated failures in commands::skills::*)
DeepSeek API cache reproduction
Using the official Chat Completions API with a long repeated prefix:
Repeated request with an unchanged tool catalog:
- prompt_cache_hit_tokens = 4096
- prompt_cache_miss_tokens = 18
Repeated request after the visible tool catalog grew:
- prompt_cache_hit_tokens = 3328
- prompt_cache_miss_tokens = 791
That is a loss of 768 cache-hit tokens and an increase of 773 miss tokens on the repeated request when the visible tool catalog grows between requests.
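The arithmetic behind those figures can be checked with a short sketch. The CacheUsage struct and its methods are hypothetical helpers; only the four token counts come from the reproduction above.

```rust
/// Hypothetical holder for the two usage counters reported per request.
struct CacheUsage {
    hit: u64,  // prompt_cache_hit_tokens
    miss: u64, // prompt_cache_miss_tokens
}

impl CacheUsage {
    /// Total prompt tokens for the request.
    fn total(&self) -> u64 {
        self.hit + self.miss
    }

    /// Fraction of prompt tokens served from cache.
    fn hit_ratio(&self) -> f64 {
        self.hit as f64 / self.total() as f64
    }
}

fn main() {
    let stable = CacheUsage { hit: 4096, miss: 18 };
    let grown = CacheUsage { hit: 3328, miss: 791 };

    assert_eq!(stable.hit - grown.hit, 768); // cache-hit tokens lost
    assert_eq!(grown.miss - stable.miss, 773); // extra miss tokens
    assert_eq!(grown.total() - stable.total(), 5); // net prompt growth

    println!("stable hit ratio: {:.1}%", stable.hit_ratio() * 100.0);
    println!("grown hit ratio:  {:.1}%", grown.hit_ratio() * 100.0);
}
```

The prompt grew by only 5 tokens in total, yet 768 previously cached tokens turned into misses, taking the hit ratio from roughly 99.6% to roughly 80.8%.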
Notes
This is a focused first fix. It does not try to solve MCP churn, compaction strategy, or tool-result payload size.