feat(view): Incremental view construction and conversation state view property#2141
feat(view): Incremental view construction and conversation state view property#2141
Conversation
…ties Replace @cached_property with a manually-invalidated cache for View.manipulation_indices. The cached_property was never invalidated when add_event or enforce_properties mutated self.events, causing stale indices to be returned. Now _invalidate_manipulation_indices() is called in every branch of add_event and enforce_properties that modifies the events list. Co-authored-by: openhands <openhands@all-hands.dev>
Coverage Report •
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Hi! I started running the integration tests on your PR. You will receive a comment with the results shortly. |
|
Hi! I started running the condenser tests on your PR. You will receive a comment with the results shortly. Note: These are non-blocking tests that validate condenser functionality across different LLMs. |
Condenser Test Results (Non-Blocking)
🧪 Integration Tests ResultsOverall Success Rate: 100.0% 📊 Summary
📋 Detailed Resultslitellm_proxy_anthropic_claude_opus_4_5_20251101
litellm_proxy_gpt_5.1_codex_max
Skipped Tests:
|
all-hands-bot
left a comment
There was a problem hiding this comment.
🟡 Acceptable - Solid design making View a first-class object. Manual cache invalidation is correct but fragile. No critical issues.
Key Insight: The real win here isn't performance—it's making View an explicit representation of the agent's attention window, accessible throughout the system. Good architectural move.
See inline comments for improvement opportunities.
| for property in ALL_PROPERTIES: | ||
| results &= property.manipulation_indices(self.events) | ||
|
|
||
| self._cached_manipulation_indices = results |
There was a problem hiding this comment.
🟠 Important: Manual cache invalidation is fragile. This works now, but it's easy to forget _invalidate_manipulation_indices() when adding new methods that modify self.events.
Suggestion: Consider documenting this invariant clearly, or use a different pattern like:
- Wrap
self.eventsin a custom list type that auto-invalidates on mutation - Use Pydantic's
@computed_field(cached=True)if the model config allows it
Not blocking—current implementation is correct—but this is a maintenance risk.
| Since enforcement is intended as a fallback to inductively maintaining the | ||
| properties via the associated manipulation indices, any time a property must be | ||
| enforced a warning is logged. | ||
|
|
||
| Modifies the view in-place. | ||
| """ | ||
| for property in ALL_PROPERTIES: | ||
| events_to_forget = property.enforce(current_view_events, all_events) | ||
| events_to_forget = property.enforce(self.events, all_events) | ||
| if events_to_forget: | ||
| logger.warning( | ||
| f"Property {property.__class__} enforced, " | ||
| f"{len(events_to_forget)} events dropped." | ||
| ) | ||
| return View.enforce_properties( | ||
| [ | ||
| event | ||
| for event in current_view_events | ||
| if event.id not in events_to_forget | ||
| ], | ||
| all_events, | ||
| ) | ||
| return current_view_events | ||
| self.events = [ | ||
| event for event in self.events if event.id not in events_to_forget | ||
| ] | ||
| self._invalidate_manipulation_indices() | ||
|
|
||
| @staticmethod | ||
| def from_events(events: Sequence[Event]) -> View: | ||
| """Create a view from a list of events, respecting the semantics of any | ||
| condensation events. | ||
| # If we've forgotten events to enforce the properties, we'll need to | ||
| # attempt to apply each property again. Once we get all the way through | ||
| # the properties without any kind of modification, we can exit the loop. | ||
| self.enforce_properties(all_events) |
There was a problem hiding this comment.
🟡 Suggestion: The recursive call with break is a bit harder to follow than necessary. The old tail-recursive pattern was clearer.
Consider making the recursion explicit:
def enforce_properties(self, all_events: Sequence[Event]) -> None:
"""Enforce all properties, recursively dropping events until stable."""
while True:
changed = False
for property in ALL_PROPERTIES:
events_to_forget = property.enforce(self.events, all_events)
if events_to_forget:
logger.warning(
f"Property {property.__class__} enforced, "
f"{len(events_to_forget)} events dropped."
)
self.events = [
event for event in self.events
if event.id not in events_to_forget
]
self._invalidate_manipulation_indices()
changed = True
break # Restart property checks
if not changed:
break # All properties satisfiedThis makes it clearer that we're iterating until a fixed point.
| logger.debug(f"Skipping non-LLMConvertibleEvent of type {type(event)}") | ||
|
|
||
| @staticmethod | ||
| def from_events(events: Sequence[Event]) -> View: |
There was a problem hiding this comment.
🟢 Nit: This incrementally builds the view, which is good. But there's no test coverage for the case where a non-LLMConvertibleEvent (that's not condensation-related) is added.
Consider adding a test that verifies these are silently skipped:
def test_add_event_skips_non_llm_convertible():
view = View()
view.add_event(ConversationStateUpdateEvent(...))
assert len(view.events) == 0Not critical, just helps document the behavior.
|
@OpenHands Do a /codereview-roasted on this PR. Post your review as a comment on the PR. Note that it will be rendered as markdown. Once you posted, take a step back and look with fresh eyes: re-read critically your review and the PR, and analyze every point of it, is it actually real, is it minor, is it major compared to what it seemed? Post your re-analysis as a new comment on the PR. |
|
I'm on it! enyst can track my progress at all-hands.dev |
/codereview-roastedTaste rating: 🔴 Needs improvementYou took a relatively simple data model ("the event log is the source of truth") and replaced it with two mutable sources of truth: That can be fine if you make it impossible to get out of sync. Right now it’s very possible, and the failure mode is the worst kind: silent wrong behavior. [CRITICAL ISSUES]
[IMPROVEMENT OPPORTUNITIES]
[TESTING GAPS]
VERDICT❌ Needs rework before I’d be comfortable calling this “safer/faster”. The concept (incremental view) is fine; the current implementation relies on “everyone always uses the right mutation path” and that’s not an engineering strategy. KEY INSIGHTYou didn’t add a |
Re-analysis of my /codereview-roasted (fresh eyes)Re-reading both the PR and my own comment, here’s what I think is actually important vs what’s more “defense-in-depth / taste”. Still major / real
If only one thing gets addressed, it should be this: either make Probably medium / depends on invariants
So: I still think a post-condensation sanity/enforcement hook is a good idea, but I’d downgrade this from “guaranteed breakage” to “cheap guardrail that prevents future weirdness / version skew issues”. Probably minor
NetMy original review wasn’t wrong about the core risk, but I was too harsh in implying the condensation enforcement change is necessarily a runtime bug. The API/invariant mismatch between |
Summary of work completed
Compliance checklist (per request)
Changes / pushing
|
Summary
This PR updates the conversation state with a persistent
Viewobject that is incrementally updated as events are recorded.To support this,
Viewcreation has been refactored to be incremental. Events are added one at a time using theView.add_eventfunction, which updates the view in-place and handles condensation application, unhandled condensation request tracking, and so on.This should improve performance (we no longer need to rebuild a view from all events every time the agent takes a step) and make
Viewa first-class object accessible wherever the conversation state is.Design Decisions
Views are maintained by
ConversationStateobjects as a private field, and exposed as a property. This ensures the view is not serialized when the conversation state is saved. Instead, we rebuild the view from the whole list of events when the state is deserialized.The idea is to avoid accidentally serializing events outside the file system store we maintain, but there is the tradeoff that loading conversation states might be slightly more expensive. This is an easy decision to change should the need arise.
Performance "Benefits"
While reducing the number of
View.from_eventscalls will technically improve performance, in most use cases we expect it will be incredibly minor gains.While working on this PR I had a stack trace sampler monitoring the OpenHands ACP instance that was assisting me. On that trace, something like 99% of the samples from the
Agent.stepfunction were API calls to the LLM. Other traces have that number lower (to make room for critic calls and callback handlers), butViewconstruction has never been more than 0.5% of the duration ofAgent.step.Point is, we're hugely network-bound.
The real benefit of this PR is making
Viewa first-class object accessible to more of the system. It exposes precisely the list of events that are converted to messages and sent to the LLM, and so represents the agent's "attention window". That has to be helpful outside the condenser.Breaking Changes
View.condensationsfield removed.View.enforce_propertieschanged from a static method to an in-place modification of the calling view.prepare_llm_messagesno longer takes a list events as input. Instead of converting that list of events into a view it takes a view directly.Lastly, because
View.enforce_propertiesis only called when a view is constructed directly from a list of events (instead of being incrementally built), we now only enforce properties when conversation states are loaded from disk.Checklist
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
eclipse-temurin:17-jdknikolaik/python-nodejs:python3.12-nodejs22golang:1.21-bookwormPull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64 docker pull ghcr.io/openhands/agent-server:b6e828d-pythonRun
All tags pushed for this build
About Multi-Architecture Support
b6e828d-python) is a multi-arch manifest supporting both amd64 and arm64b6e828d-python-amd64) are also available if needed