
fix(context): preserve cache-boundary metadata through prompt builders #506

Merged
senamakel merged 6 commits into tinyhumansai:main from senamakel:feat/agent-improve-mk2 on Apr 11, 2026
Conversation

Member

@senamakel senamakel commented Apr 11, 2026

Summary

  • Preserve system-prompt cache-boundary metadata in the agent/context prompt pipeline.
  • Route main-agent prompt building through ContextManager without dropping the boundary offset needed for provider prefix caching.
  • Move typed sub-agent prompt rendering onto the centralized openhuman::context::prompt path.
  • Thread sub-agent render options so omit/include flags stay aligned with AgentDefinition behavior.
  • Add/keep coverage around cache-boundary propagation in session and sub-agent flows.

Problem

  • The branch moved prompt assembly toward the centralized openhuman::context module, but cache-boundary metadata still needed to survive into provider requests.
  • During the merge from upstream/main, the harness ended up with conflicting prompt-building paths, which broke the Rust build and risked silently disabling prefix-cache hints for main-agent and sub-agent calls.
  • Reviewers need confidence that the prompt refactor still preserves deterministic prompt construction and the boundary offset used for request caching.

Solution

  • Add ContextManager::build_system_prompt_with_cache_metadata so callers can use the centralized builder and still receive RenderedPrompt metadata.
  • Update the main turn path to consume that rendered prompt and keep system_prompt_cache_boundary in session state.
  • Switch typed sub-agent prompt generation to context::prompt::render_subagent_system_prompt, then extract and forward the cache boundary explicitly.
  • Keep sub-agent omit/include behavior explicit through SubagentRenderOptions so definitions that opt into safety or identity blocks continue to render them correctly.
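The marker-extraction step behind `RenderedPrompt` can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the marker string, the `extract_cache_boundary` signature, and the choice of a byte offset for the boundary are all assumptions.

```rust
/// Hypothetical marker value; the PR only says that a constant exists.
const CACHE_BOUNDARY_MARKER: &str = "<!-- cache-boundary -->";

/// Field names follow the PR walkthrough; the derives are assumptions.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RenderedPrompt {
    pub text: String,
    pub cache_boundary: Option<usize>,
}

/// Removes the first marker from `raw` and reports the byte offset where it
/// stood, so the returned text never leaks the internal marker.
pub fn extract_cache_boundary(raw: &str) -> RenderedPrompt {
    match raw.find(CACHE_BOUNDARY_MARKER) {
        Some(idx) => {
            let mut text = String::with_capacity(raw.len());
            text.push_str(&raw[..idx]);
            text.push_str(&raw[idx + CACHE_BOUNDARY_MARKER.len()..]);
            RenderedPrompt { text, cache_boundary: Some(idx) }
        }
        // No marker: the prompt is returned unchanged, with no boundary.
        None => RenderedPrompt { text: raw.to_string(), cache_boundary: None },
    }
}

fn main() {
    let raw = format!(
        "static rules\n{}workspace: /tmp/demo\n",
        CACHE_BOUNDARY_MARKER
    );
    let rendered = extract_cache_boundary(&raw);
    assert_eq!(rendered.cache_boundary, Some("static rules\n".len()));
    assert!(!rendered.text.contains(CACHE_BOUNDARY_MARKER));
    println!("{rendered:?}");
}
```

Returning the offset alongside the cleaned text means callers do not need to know the marker exists at all.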

Submission Checklist

  • Unit tests — Vitest (app/) and/or cargo test (core) for logic you add or change
  • E2E / integration — Where behavior is user-visible or crosses UI → Tauri → sidecar → JSON-RPC; use existing harnesses (app/test/e2e, mock backend, tests/json_rpc_e2e.rs as appropriate)
  • N/A — Full E2E was not applicable for this build-fix pass; verification here was cargo check --manifest-path Cargo.toml and yarn build
  • Doc comments (/// and //! in Rust; JSDoc or brief file/module headers in TS) on public APIs and non-obvious modules
  • Inline comments — Where logic, invariants, or edge cases aren’t clear from names alone (keep them grep-friendly; avoid restating the code)

Impact

  • Runtime impact is in the Rust core agent harness and context pipeline; no direct frontend behavior change.
  • Preserves prompt prefix-cache metadata for provider calls, which should avoid unnecessary prompt re-prefill when the system prompt is reused.
  • No config or migration changes.
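To make the prefix-cache benefit concrete, here is a hedged sketch of how a provider adapter might consume the boundary. The helper name `split_at_cache_boundary` is hypothetical, and treating the boundary as a byte offset is an assumption; the PR does not describe the provider side.

```rust
/// Splits a system prompt into (cacheable prefix, dynamic suffix).
/// Falls back to "nothing cacheable" when the boundary is missing,
/// out of range, or not on a UTF-8 character boundary.
fn split_at_cache_boundary(text: &str, boundary: Option<usize>) -> (&str, &str) {
    match boundary {
        // `is_char_boundary` is false for any index past the end of `text`.
        Some(idx) if text.is_char_boundary(idx) => text.split_at(idx),
        _ => ("", text),
    }
}

fn main() {
    let prompt = "static rules\nworkspace: /tmp/demo\n";
    let boundary = Some("static rules\n".len());
    let (prefix, suffix) = split_at_cache_boundary(prompt, boundary);
    assert_eq!(prefix, "static rules\n");
    assert_eq!(suffix, "workspace: /tmp/demo\n");
    println!("cacheable bytes: {}, dynamic bytes: {}", prefix.len(), suffix.len());
}
```

When the boundary is `None`, the whole prompt is treated as dynamic, which is the silent loss of caching hints that this PR is meant to prevent.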

Related

  • Issue(s):
  • Follow-up PR(s)/TODOs:

Summary by CodeRabbit

Release Notes

  • New Features

    • Implemented system prompt cache boundary tracking to optimize prompt caching and reuse across conversation turns.
    • Sub-agents now correctly respect memory context configuration settings during execution.
  • Improvements

    • Enhanced cache metadata propagation in sub-agent execution paths.
  • Tests

    • Added validation tests for cache boundary behavior and persistence.

- Updated the `SystemPromptBuilder` to include a new method for building prompts with cache metadata, returning a `RenderedPrompt` struct that contains the prompt text and an optional cache boundary index.
- Introduced a constant for the cache boundary marker to improve readability and maintainability.
- Modified the `extract_cache_boundary` function to cleanly remove the cache boundary marker from the rendered prompt while returning the relevant index.
- Updated tests to validate the new cache boundary functionality and ensure that the system prompt does not leak internal markers.
- Adjusted the agent's session management to track the cache boundary, enhancing the overall prompt handling and memory context management.
- Introduced `build_system_prompt_with_cache_metadata` method in `ContextManager` to assemble the opening system prompt while preserving cache-boundary metadata for improved request prefix caching.
- Updated imports to include `RenderedPrompt` for the new functionality, enhancing the context management capabilities.
Contributor

coderabbitai Bot commented Apr 11, 2026

Caution

Review failed

The head commit changed during the review from 611b6cf to 18beb1a.

📝 Walkthrough

Walkthrough

The pull request introduces a system_prompt_cache_boundary field to the Agent struct and threads it through the agent harness session lifecycle, including builder initialization, prompt rendering, cache extraction, and provider request wiring for both direct and sub-agent execution paths.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Agent Field Addition: src/openhuman/agent/harness/session/types.rs, src/openhuman/agent/harness/session/builder.rs, src/openhuman/agent/harness/session/runtime.rs | Added new system_prompt_cache_boundary: Option<usize> field to Agent struct. AgentBuilder::build() initializes field to None. Agent::clear_history now resets field to None alongside conversation history. |
| Prompt Infrastructure: src/openhuman/context/prompt.rs, src/openhuman/context/manager.rs | Introduced RenderedPrompt { text: String, cache_boundary: Option<usize> } struct and extract_cache_boundary() helper. SystemPromptBuilder::build_with_cache_metadata() builds prompts while tracking/removing cache boundary marker. Added new public method ContextManager::build_system_prompt_with_cache_metadata(). |
| Session Turn Logic: src/openhuman/agent/harness/session/turn.rs | Changed Agent::build_system_prompt to return RenderedPrompt instead of String. On first turn, system prompt text is stored in history and cache boundary is recorded in Agent.system_prompt_cache_boundary. Provider requests now use captured boundary instead of hardcoded None. Fork context populates cache boundary from agent field. |
| Subagent Cache Boundary Handling: src/openhuman/agent/harness/subagent_runner.rs | Typed mode now renders system prompt via render_subagent_system_prompt() and extracts cache boundary. Conditional memory context injection based on definition.omit_memory_context. Fork mode uses fork.tool_specs and propagates fork.cache_boundary. Extended run_inner_loop() with system_prompt_cache_boundary parameter; stops hardcoding None. |
| Test Updates: src/openhuman/agent/harness/session/tests.rs | CapturedCall now stores cache_boundary: Option<usize>. RecordingProvider captures request.system_prompt_cache_boundary per call. Updated assertions to verify cache boundary consistency across turns, presence of boundary value, and absence of internal marker from rendered text. |
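The session-level behavior summarized above (capture the boundary on the first turn, reuse it on later turns, reset it with the history) might look roughly like this. Only the field name `system_prompt_cache_boundary` and the `RenderedPrompt` shape come from the PR; the trimmed-down `Agent`, the `ChatRequest` layout, and `next_request` are hypothetical stand-ins.

```rust
#[derive(Debug, Clone)]
struct RenderedPrompt {
    text: String,
    cache_boundary: Option<usize>,
}

/// Hypothetical, trimmed-down request type.
#[derive(Debug)]
struct ChatRequest {
    messages: Vec<String>,
    system_prompt_cache_boundary: Option<usize>,
}

#[derive(Default)]
struct Agent {
    history: Vec<String>,
    system_prompt_cache_boundary: Option<usize>,
}

impl Agent {
    /// Stand-in for the real builder: static text first, dynamic text after.
    fn build_system_prompt(&self) -> RenderedPrompt {
        let static_part = "You are a helpful agent.\n";
        RenderedPrompt {
            text: format!("{static_part}workspace: /tmp/demo\n"),
            cache_boundary: Some(static_part.len()),
        }
    }

    /// Builds the next provider request, rendering the prompt only once.
    fn next_request(&mut self, user_message: &str) -> ChatRequest {
        if self.history.is_empty() {
            let rendered = self.build_system_prompt();
            self.system_prompt_cache_boundary = rendered.cache_boundary;
            self.history.push(rendered.text);
        }
        self.history.push(user_message.to_string());
        ChatRequest {
            messages: self.history.clone(),
            system_prompt_cache_boundary: self.system_prompt_cache_boundary,
        }
    }

    /// Clearing history also invalidates the stored boundary.
    fn clear_history(&mut self) {
        self.history.clear();
        self.system_prompt_cache_boundary = None;
    }
}

fn main() {
    let mut agent = Agent::default();
    let first = agent.next_request("hello");
    let second = agent.next_request("again");
    assert!(first.system_prompt_cache_boundary.is_some());
    assert_eq!(
        first.system_prompt_cache_boundary,
        second.system_prompt_cache_boundary
    );
    assert_eq!(second.messages.len(), 3);
    agent.clear_history();
    assert_eq!(agent.system_prompt_cache_boundary, None);
}
```

Keeping the boundary in session state, rather than recomputing it per request, is what lets every turn send the same value to the provider.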

Sequence Diagram(s)

sequenceDiagram
    participant A as Agent
    participant T as Turn Logic
    participant CM as ContextManager
    participant SPB as SystemPromptBuilder
    participant PR as Provider

    A->>T: build_system_prompt(learned_context)
    T->>CM: build_system_prompt_with_cache_metadata(ctx)
    CM->>SPB: build_with_cache_metadata(ctx)
    SPB->>SPB: render + insert CACHE_BOUNDARY_MARKER
    SPB->>SPB: extract_cache_boundary() → RenderedPrompt
    SPB-->>CM: RenderedPrompt{text, cache_boundary}
    CM-->>T: RenderedPrompt{text, cache_boundary}
    T->>A: store cache_boundary (first turn only)
    T->>A: add rendered.text to history
    T->>PR: ChatRequest with system_prompt_cache_boundary
    PR-->>T: response

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • PR #500: Introduces RenderedPrompt struct and extract_cache_boundary logic in prompt infrastructure that directly enables this PR's cache boundary tracking and propagation through the agent harness.
  • PR #474: Modifies agent prompt rendering and sub-agent cache-boundary plumbing with overlapping changes to system_prompt_cache_boundary field threading through agent, turn, and provider request wiring.
  • PR #496: Updates subagent_runner.rs typed-mode context wiring and parent memory context forwarding logic that intersects with this PR's conditional memory context injection and cache boundary handling.

Poem

🐰✨ Cache boundaries dance through prompt-rendering halls,
From context to agent, through every turn's calls,
A rabbit tracks metadata with markers so keen,
Extracting the truth from the space in-between! 🎯

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly and accurately describes the main objective of the PR: preserving system-prompt cache-boundary metadata through the prompt builders pipeline. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


@senamakel senamakel marked this pull request as ready for review April 11, 2026 08:39
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/openhuman/context/prompt.rs (1)

500-607: ⚠️ Potential issue | 🟠 Major

Typed sub-agent prompts never produce a cache boundary.

src/openhuman/agent/harness/subagent_runner.rs now wraps this helper in extract_cache_boundary(...), but render_subagent_system_prompt() never emits CACHE_BOUNDARY_MARKER or otherwise records a boundary. That means typed-mode sub-agents always send system_prompt_cache_boundary = None, so the new prefix-caching path silently does nothing there and the added typed-mode boundary test should fail. Have this helper return a RenderedPrompt (or explicitly set a terminal boundary when the whole prompt is cacheable) instead of extracting from a marker-free string afterward.
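The failure mode described here can be reproduced in a small sketch: searching a marker-free string for the marker always yields `None`, whereas a renderer that emits the marker before its dynamic section yields a usable offset. The marker value, the two render helpers, and the "## Workspace" section layout are hypothetical stand-ins for `render_subagent_system_prompt()`.

```rust
/// Hypothetical marker value; the real constant is not shown in the PR page.
const CACHE_BOUNDARY_MARKER: &str = "<!-- cache-boundary -->";

/// Returns the byte offset of the marker, if the renderer emitted one.
fn find_boundary(rendered: &str) -> Option<usize> {
    rendered.find(CACHE_BOUNDARY_MARKER)
}

/// Renderer that never emits the marker (the reported behavior).
fn render_without_marker(instructions: &str, workspace: &str) -> String {
    format!("{}\n## Workspace\n{}\n", instructions, workspace)
}

/// Renderer that places the marker before the dynamic workspace section,
/// which is one possible way to address the finding.
fn render_with_marker(instructions: &str, workspace: &str) -> String {
    format!(
        "{}\n{}## Workspace\n{}\n",
        instructions, CACHE_BOUNDARY_MARKER, workspace
    )
}

fn main() {
    let plain = render_without_marker("Be concise.", "/tmp/demo");
    // No marker in the text, so extraction can only ever report None.
    assert_eq!(find_boundary(&plain), None);

    let marked = render_with_marker("Be concise.", "/tmp/demo");
    assert_eq!(find_boundary(&marked), Some("Be concise.\n".len()));
}
```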

src/openhuman/agent/harness/subagent_runner.rs (1)

352-376: ⚠️ Potential issue | 🟠 Major

Fork-mode allowlist no longer matches the replayed tool schema.

After switching the request body to fork.tool_specs, the execution allowlist is still built from parent.all_tools. A model that emits a tool name outside the replayed schema will now pass the allowlist check and run anyway, so fork mode is no longer constrained to the parent’s exact request surface. Derive the allowlist from fork.tool_specs (or otherwise filter execution to that same snapshot).

🔒 Keep execution aligned with the replayed schema
-    let allowed_names: HashSet<String> = parent
-        .all_tools
-        .iter()
-        .map(|t| t.name().to_string())
-        .collect();
+    let allowed_names: HashSet<String> = fork
+        .tool_specs
+        .iter()
+        .map(|spec| spec.name.clone())
+        .collect();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/agent/harness/subagent_runner.rs` around lines 352 - 376, The
allowlist is currently built from parent.all_tools but the request body passed
to run_inner_loop is fork.tool_specs, so a model could invoke tools not in the
replayed schema; change the allowed_names computation to derive names from
fork.tool_specs (or compute the intersection of parent.all_tools and
fork.tool_specs) so the set passed to run_inner_loop matches the replayed tool
snapshot; update the variable built from parent.all_tools to iterate
fork.tool_specs (or filter parent.all_tools by fork.tool_specs names) before
calling run_inner_loop.
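A self-contained sketch of the suggested allowlist derivation follows. The `ToolSpec` struct and the `is_call_allowed` helper are hypothetical simplifications; the real `fork.tool_specs` element type is assumed only to expose a `name` field, as the suggested diff implies.

```rust
use std::collections::HashSet;

/// Hypothetical, minimal tool spec; only `name` matters for the allowlist.
#[derive(Debug, Clone)]
struct ToolSpec {
    name: String,
}

/// Builds the execution allowlist from the same snapshot that is sent
/// to the model, so the two cannot drift apart.
fn allowed_names(tool_specs: &[ToolSpec]) -> HashSet<String> {
    tool_specs.iter().map(|spec| spec.name.clone()).collect()
}

/// Gate applied before a requested tool call is executed.
fn is_call_allowed(allowed: &HashSet<String>, requested: &str) -> bool {
    allowed.contains(requested)
}

fn main() {
    let replayed = vec![
        ToolSpec { name: "read_file".to_string() },
        ToolSpec { name: "search".to_string() },
    ];
    let allowed = allowed_names(&replayed);
    assert!(is_call_allowed(&allowed, "read_file"));
    // A tool that exists on the parent but not in the replayed schema is rejected.
    assert!(!is_call_allowed(&allowed, "shell"));
}
```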
🧹 Nitpick comments (1)
src/openhuman/agent/harness/session/runtime.rs (1)

113-117: Add a debug breadcrumb when history resets.

clear_history() now invalidates both conversation history and the stored prompt-cache boundary, but there’s still no structured trace when that reset happens. A small debug log here would make unexpected cache misses after a manual reset much easier to diagnose.

🔍 Suggested tweak
 pub fn clear_history(&mut self) {
+    log::debug!(
+        "[agent] clearing history history_len={} cache_boundary_present={}",
+        self.history.len(),
+        self.system_prompt_cache_boundary.is_some()
+    );
     self.history.clear();
     self.system_prompt_cache_boundary = None;
 }

As per coding guidelines, "Add substantial, development-oriented logs on new/changed flows; include logs at entry/exit points, branch decisions, external calls, retries/timeouts, state transitions, and error handling paths".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/agent/harness/session/runtime.rs` around lines 113 - 117, Add a
debug breadcrumb in clear_history to log the state transition when conversation
history is reset: update the pub fn clear_history(&mut self) method to emit a
debug/processLogger-style log indicating that history was cleared and that
system_prompt_cache_boundary was set to None, referencing the history and
system_prompt_cache_boundary fields so callers can correlate cache-miss events;
keep the log level as debug and include minimal context (e.g., previous history
length and whether a cache boundary existed) to aid diagnostics.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2fdc8617-b08e-4fd9-8994-4bbb57f6f764

📥 Commits

Reviewing files that changed from the base of the PR and between afb95e7 and 91547e3.

📒 Files selected for processing (8)
  • src/openhuman/agent/harness/session/builder.rs
  • src/openhuman/agent/harness/session/runtime.rs
  • src/openhuman/agent/harness/session/tests.rs
  • src/openhuman/agent/harness/session/turn.rs
  • src/openhuman/agent/harness/session/types.rs
  • src/openhuman/agent/harness/subagent_runner.rs
  • src/openhuman/context/manager.rs
  • src/openhuman/context/prompt.rs

Comment on lines 61 to +67
 log::info!("[agent] system prompt built — initialising conversation history");
 log::info!(
 "[agent_loop] system prompt built chars={}",
-system_prompt.chars().count()
+rendered_prompt.text.chars().count()
 );
-log::debug!("[agent_loop] system prompt body:\n{system_prompt}");
+log::debug!("[agent_loop] system prompt body:\n{}", rendered_prompt.text);
+self.system_prompt_cache_boundary = rendered_prompt.cache_boundary;
Contributor


⚠️ Potential issue | 🟠 Major

Don't log the full rendered system prompt.

With learning enabled, this debug line can include memory-derived user profile/context and workspace bootstrap text. Logging the boundary or a hash is enough for cache debugging without leaking prompt contents.

🛡️ Safer logging
-            log::debug!("[agent_loop] system prompt body:\n{}", rendered_prompt.text);
+            log::debug!(
+                "[agent_loop] system prompt metadata cache_boundary={:?}",
+                rendered_prompt.cache_boundary
+            );

As per coding guidelines, "Never log secrets, raw JWTs, API keys, or full PII in debug logs; redact or omit sensitive fields".

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
 log::info!("[agent] system prompt built — initialising conversation history");
 log::info!(
 "[agent_loop] system prompt built chars={}",
 rendered_prompt.text.chars().count()
 );
-log::debug!("[agent_loop] system prompt body:\n{}", rendered_prompt.text);
+log::debug!(
+"[agent_loop] system prompt metadata cache_boundary={:?}",
+rendered_prompt.cache_boundary
+);
 self.system_prompt_cache_boundary = rendered_prompt.cache_boundary;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/agent/harness/session/turn.rs` around lines 61 - 67, The debug
log prints the full rendered system prompt (rendered_prompt.text) which may
contain sensitive data; change the logging in the block that sets
self.system_prompt_cache_boundary to avoid printing the prompt body—remove or
redact the log::debug!("[agent_loop] system prompt body:\n{}",
rendered_prompt.text) call and instead log only non-sensitive metadata such as
rendered_prompt.cache_boundary or a hash (e.g., SHA256 of rendered_prompt.text)
and the char count; ensure the other lines (info logs and assignment to
self.system_prompt_cache_boundary) remain intact.
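One way to log a hash instead of the prompt body is sketched below. Note the swap: it uses the standard library's `DefaultHasher` rather than SHA-256 to stay dependency-free, so the fingerprint is only suitable for correlating log lines within one build, not as a cryptographic digest. The helper names `prompt_fingerprint` and `prompt_log_line` are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Non-cryptographic fingerprint of the prompt text.
/// Not SHA-256, and not guaranteed stable across Rust releases.
fn prompt_fingerprint(text: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    text.hash(&mut hasher);
    hasher.finish()
}

/// Formats a log line that carries only metadata, never the prompt body.
fn prompt_log_line(text: &str, cache_boundary: Option<usize>) -> String {
    format!(
        "[agent_loop] system prompt metadata chars={} fingerprint={:016x} cache_boundary={:?}",
        text.chars().count(),
        prompt_fingerprint(text),
        cache_boundary
    )
}

fn main() {
    let prompt = "You are a helpful agent.\nuser profile: (sensitive)\n";
    let line = prompt_log_line(prompt, Some(25));
    // The body itself must not appear in the log output.
    assert!(!line.contains("user profile"));
    println!("{line}");
}
```

Two turns that reuse the same system prompt produce the same fingerprint, which is enough to confirm prompt stability when debugging cache misses.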

- Updated `render_subagent_system_prompt` to insert a cache boundary marker before the workspace section, allowing typed sub-agents to maintain static instructions above this boundary for efficient prompt reuse.
- Added a new test to verify that the cache boundary is correctly included in the rendered prompt, ensuring that the system prompt structure supports improved caching behavior.
- Adjusted comments for clarity and updated the runtime banner numbering to reflect the new structure.
… prompt test

- Reformatted the workspace path construction in the `render_subagent_system_prompt_includes_cache_boundary_before_workspace` test for improved readability, using a multi-line format for clarity.
- This change enhances code maintainability and aligns with existing formatting standards in the codebase.
@senamakel senamakel merged commit 362d0a0 into tinyhumansai:main Apr 11, 2026
AusAgentSmith pushed a commit to AusAgentSmith/openhuman that referenced this pull request May 23, 2026
tinyhumansai#506)

* feat(prompt): enhance system prompt handling with cache boundary support

- Updated the `SystemPromptBuilder` to include a new method for building prompts with cache metadata, returning a `RenderedPrompt` struct that contains the prompt text and an optional cache boundary index.
- Introduced a constant for the cache boundary marker to improve readability and maintainability.
- Modified the `extract_cache_boundary` function to cleanly remove the cache boundary marker from the rendered prompt while returning the relevant index.
- Updated tests to validate the new cache boundary functionality and ensure that the system prompt does not leak internal markers.
- Adjusted the agent's session management to track the cache boundary, enhancing the overall prompt handling and memory context management.

* feat(context): add method to build system prompt with cache metadata

- Introduced `build_system_prompt_with_cache_metadata` method in `ContextManager` to assemble the opening system prompt while preserving cache-boundary metadata for improved request prefix caching.
- Updated imports to include `RenderedPrompt` for the new functionality, enhancing the context management capabilities.

* feat(prompt): include cache boundary in subagent system prompt rendering

- Updated `render_subagent_system_prompt` to insert a cache boundary marker before the workspace section, allowing typed sub-agents to maintain static instructions above this boundary for efficient prompt reuse.
- Added a new test to verify that the cache boundary is correctly included in the rendered prompt, ensuring that the system prompt structure supports improved caching behavior.
- Adjusted comments for clarity and updated the runtime banner numbering to reflect the new structure.