fix(context): preserve cache-boundary metadata through prompt builders#506
Conversation
- Updated the `SystemPromptBuilder` to include a new method for building prompts with cache metadata, returning a `RenderedPrompt` struct that contains the prompt text and an optional cache boundary index. - Introduced a constant for the cache boundary marker to improve readability and maintainability. - Modified the `extract_cache_boundary` function to cleanly remove the cache boundary marker from the rendered prompt while returning the relevant index. - Updated tests to validate the new cache boundary functionality and ensure that the system prompt does not leak internal markers. - Adjusted the agent's session management to track the cache boundary, enhancing the overall prompt handling and memory context management.
- Introduced `build_system_prompt_with_cache_metadata` method in `ContextManager` to assemble the opening system prompt while preserving cache-boundary metadata for improved request prefix caching. - Updated imports to include `RenderedPrompt` for the new functionality, enhancing the context management capabilities.
📝 WalkthroughWalkthroughThe pull request introduces a Changes
Sequence Diagram(s)sequenceDiagram
participant A as Agent
participant T as Turn Logic
participant CM as ContextManager
participant SPB as SystemPromptBuilder
participant PR as Provider
A->>T: build_system_prompt(learned_context)
T->>CM: build_system_prompt_with_cache_metadata(ctx)
CM->>SPB: build_with_cache_metadata(ctx)
SPB->>SPB: render + insert CACHE_BOUNDARY_MARKER
SPB->>SPB: extract_cache_boundary() → RenderedPrompt
SPB-->>CM: RenderedPrompt{text, cache_boundary}
CM-->>T: RenderedPrompt{text, cache_boundary}
T->>A: store cache_boundary (first turn only)
T->>A: add rendered.text to history
T->>PR: ChatRequest with system_prompt_cache_boundary
PR-->>T: response
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/openhuman/context/prompt.rs (1)
500-607:⚠️ Potential issue | 🟠 MajorTyped sub-agent prompts never produce a cache boundary.
src/openhuman/agent/harness/subagent_runner.rsnow wraps this helper inextract_cache_boundary(...), butrender_subagent_system_prompt()never emitsCACHE_BOUNDARY_MARKERor otherwise records a boundary. That means typed-mode sub-agents always sendsystem_prompt_cache_boundary = None, so the new prefix-caching path silently does nothing there and the added typed-mode boundary test should fail. Have this helper return aRenderedPrompt(or explicitly set a terminal boundary when the whole prompt is cacheable) instead of extracting from a marker-free string afterward.src/openhuman/agent/harness/subagent_runner.rs (1)
352-376:⚠️ Potential issue | 🟠 MajorFork-mode allowlist no longer matches the replayed tool schema.
After switching the request body to
fork.tool_specs, the execution allowlist is still built fromparent.all_tools. A model that emits a tool name outside the replayed schema will now pass the allowlist check and run anyway, so fork mode is no longer constrained to the parent’s exact request surface. Derive the allowlist fromfork.tool_specs(or otherwise filter execution to that same snapshot).🔒 Keep execution aligned with the replayed schema
- let allowed_names: HashSet<String> = parent - .all_tools - .iter() - .map(|t| t.name().to_string()) - .collect(); + let allowed_names: HashSet<String> = fork + .tool_specs + .iter() + .map(|spec| spec.name.clone()) + .collect();🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/agent/harness/subagent_runner.rs` around lines 352 - 376, The allowlist is currently built from parent.all_tools but the request body passed to run_inner_loop is fork.tool_specs, so a model could invoke tools not in the replayed schema; change the allowed_names computation to derive names from fork.tool_specs (or compute the intersection of parent.all_tools and fork.tool_specs) so the set passed to run_inner_loop matches the replayed tool snapshot; update the variable built from parent.all_tools to iterate fork.tool_specs (or filter parent.all_tools by fork.tool_specs names) before calling run_inner_loop.
🧹 Nitpick comments (1)
src/openhuman/agent/harness/session/runtime.rs (1)
113-117: Add a debug breadcrumb when history resets.
clear_history()now invalidates both conversation history and the stored prompt-cache boundary, but there’s still no structured trace when that reset happens. A small debug log here would make unexpected cache misses after a manual reset much easier to diagnose.🔍 Suggested tweak
pub fn clear_history(&mut self) { + log::debug!( + "[agent] clearing history history_len={} cache_boundary_present={}", + self.history.len(), + self.system_prompt_cache_boundary.is_some() + ); self.history.clear(); self.system_prompt_cache_boundary = None; }As per coding guidelines, "Add substantial, development-oriented logs on new/changed flows; include logs at entry/exit points, branch decisions, external calls, retries/timeouts, state transitions, and error handling paths".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/agent/harness/session/runtime.rs` around lines 113 - 117, Add a debug breadcrumb in clear_history to log the state transition when conversation history is reset: update the pub fn clear_history(&mut self) method to emit a debug/processLogger-style log indicating that history was cleared and that system_prompt_cache_boundary was set to None, referencing the history and system_prompt_cache_boundary fields so callers can correlate cache-miss events; keep the log level as debug and include minimal context (e.g., previous history length and whether a cache boundary existed) to aid diagnostics.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/openhuman/agent/harness/session/turn.rs`:
- Around line 61-67: The debug log prints the full rendered system prompt
(rendered_prompt.text) which may contain sensitive data; change the logging in
the block that sets self.system_prompt_cache_boundary to avoid printing the
prompt body—remove or redact the log::debug!("[agent_loop] system prompt
body:\n{}", rendered_prompt.text) call and instead log only non-sensitive
metadata such as rendered_prompt.cache_boundary or a hash (e.g., SHA256 of
rendered_prompt.text) and the char count; ensure the other lines (info logs and
assignment to self.system_prompt_cache_boundary) remain intact.
---
Outside diff comments:
In `@src/openhuman/agent/harness/subagent_runner.rs`:
- Around line 352-376: The allowlist is currently built from parent.all_tools
but the request body passed to run_inner_loop is fork.tool_specs, so a model
could invoke tools not in the replayed schema; change the allowed_names
computation to derive names from fork.tool_specs (or compute the intersection of
parent.all_tools and fork.tool_specs) so the set passed to run_inner_loop
matches the replayed tool snapshot; update the variable built from
parent.all_tools to iterate fork.tool_specs (or filter parent.all_tools by
fork.tool_specs names) before calling run_inner_loop.
---
Nitpick comments:
In `@src/openhuman/agent/harness/session/runtime.rs`:
- Around line 113-117: Add a debug breadcrumb in clear_history to log the state
transition when conversation history is reset: update the pub fn
clear_history(&mut self) method to emit a debug/processLogger-style log
indicating that history was cleared and that system_prompt_cache_boundary was
set to None, referencing the history and system_prompt_cache_boundary fields so
callers can correlate cache-miss events; keep the log level as debug and include
minimal context (e.g., previous history length and whether a cache boundary
existed) to aid diagnostics.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 2fdc8617-b08e-4fd9-8994-4bbb57f6f764
📒 Files selected for processing (8)
src/openhuman/agent/harness/session/builder.rssrc/openhuman/agent/harness/session/runtime.rssrc/openhuman/agent/harness/session/tests.rssrc/openhuman/agent/harness/session/turn.rssrc/openhuman/agent/harness/session/types.rssrc/openhuman/agent/harness/subagent_runner.rssrc/openhuman/context/manager.rssrc/openhuman/context/prompt.rs
| log::info!("[agent] system prompt built — initialising conversation history"); | ||
| log::info!( | ||
| "[agent_loop] system prompt built chars={}", | ||
| system_prompt.chars().count() | ||
| rendered_prompt.text.chars().count() | ||
| ); | ||
| log::debug!("[agent_loop] system prompt body:\n{system_prompt}"); | ||
| log::debug!("[agent_loop] system prompt body:\n{}", rendered_prompt.text); | ||
| self.system_prompt_cache_boundary = rendered_prompt.cache_boundary; |
There was a problem hiding this comment.
Don't log the full rendered system prompt.
With learning enabled, this debug line can include memory-derived user profile/context and workspace bootstrap text. Logging the boundary or a hash is enough for cache debugging without leaking prompt contents.
🛡️ Safer logging
- log::debug!("[agent_loop] system prompt body:\n{}", rendered_prompt.text);
+ log::debug!(
+ "[agent_loop] system prompt metadata cache_boundary={:?}",
+ rendered_prompt.cache_boundary
+ );As per coding guidelines, "Never log secrets, raw JWTs, API keys, or full PII in debug logs; redact or omit sensitive fields".
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| log::info!("[agent] system prompt built — initialising conversation history"); | |
| log::info!( | |
| "[agent_loop] system prompt built chars={}", | |
| system_prompt.chars().count() | |
| rendered_prompt.text.chars().count() | |
| ); | |
| log::debug!("[agent_loop] system prompt body:\n{system_prompt}"); | |
| log::debug!("[agent_loop] system prompt body:\n{}", rendered_prompt.text); | |
| self.system_prompt_cache_boundary = rendered_prompt.cache_boundary; | |
| log::info!("[agent] system prompt built — initialising conversation history"); | |
| log::info!( | |
| "[agent_loop] system prompt built chars={}", | |
| rendered_prompt.text.chars().count() | |
| ); | |
| log::debug!( | |
| "[agent_loop] system prompt metadata cache_boundary={:?}", | |
| rendered_prompt.cache_boundary | |
| ); | |
| self.system_prompt_cache_boundary = rendered_prompt.cache_boundary; |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/openhuman/agent/harness/session/turn.rs` around lines 61 - 67, The debug
log prints the full rendered system prompt (rendered_prompt.text) which may
contain sensitive data; change the logging in the block that sets
self.system_prompt_cache_boundary to avoid printing the prompt body—remove or
redact the log::debug!("[agent_loop] system prompt body:\n{}",
rendered_prompt.text) call and instead log only non-sensitive metadata such as
rendered_prompt.cache_boundary or a hash (e.g., SHA256 of rendered_prompt.text)
and the char count; ensure the other lines (info logs and assignment to
self.system_prompt_cache_boundary) remain intact.
- Updated `render_subagent_system_prompt` to insert a cache boundary marker before the workspace section, allowing typed sub-agents to maintain static instructions above this boundary for efficient prompt reuse. - Added a new test to verify that the cache boundary is correctly included in the rendered prompt, ensuring that the system prompt structure supports improved caching behavior. - Adjusted comments for clarity and updated the runtime banner numbering to reflect the new structure.
… prompt test - Reformatted the workspace path construction in the `render_subagent_system_prompt_includes_cache_boundary_before_workspace` test for improved readability, using a multi-line format for clarity. - This change enhances code maintainability and aligns with existing formatting standards in the codebase.
tinyhumansai#506) * feat(prompt): enhance system prompt handling with cache boundary support - Updated the `SystemPromptBuilder` to include a new method for building prompts with cache metadata, returning a `RenderedPrompt` struct that contains the prompt text and an optional cache boundary index. - Introduced a constant for the cache boundary marker to improve readability and maintainability. - Modified the `extract_cache_boundary` function to cleanly remove the cache boundary marker from the rendered prompt while returning the relevant index. - Updated tests to validate the new cache boundary functionality and ensure that the system prompt does not leak internal markers. - Adjusted the agent's session management to track the cache boundary, enhancing the overall prompt handling and memory context management. * feat(context): add method to build system prompt with cache metadata - Introduced `build_system_prompt_with_cache_metadata` method in `ContextManager` to assemble the opening system prompt while preserving cache-boundary metadata for improved request prefix caching. - Updated imports to include `RenderedPrompt` for the new functionality, enhancing the context management capabilities. * feat(prompt): include cache boundary in subagent system prompt rendering - Updated `render_subagent_system_prompt` to insert a cache boundary marker before the workspace section, allowing typed sub-agents to maintain static instructions above this boundary for efficient prompt reuse. - Added a new test to verify that the cache boundary is correctly included in the rendered prompt, ensuring that the system prompt structure supports improved caching behavior. - Adjusted comments for clarity and updated the runtime banner numbering to reflect the new structure.
Summary
ContextManagerwithout dropping the boundary offset needed for provider prefix caching.openhuman::context::promptpath.AgentDefinitionbehavior.Problem
openhuman::contextmodule, but cache-boundary metadata still needed to survive into provider requests.upstream/main, the harness ended up with conflicting prompt-building paths, which broke the Rust build and risked silently disabling prefix-cache hints for main-agent and sub-agent calls.Solution
ContextManager::build_system_prompt_with_cache_metadataso callers can use the centralized builder and still receiveRenderedPromptmetadata.system_prompt_cache_boundaryin session state.context::prompt::render_subagent_system_prompt, then extract and forward the cache boundary explicitly.SubagentRenderOptionsso definitions that opt into safety or identity blocks continue to render them correctly.Submission Checklist
app/) and/orcargo test(core) for logic you add or changeapp/test/e2e, mock backend,tests/json_rpc_e2e.rsas appropriate)cargo check --manifest-path Cargo.tomlandyarn build//////!(Rust), JSDoc or brief file/module headers (TS) on public APIs and non-obvious modulesImpact
Related
Summary by CodeRabbit
Release Notes
New Features
Improvements
Tests