
feat(view): Incremental view construction and conversation state view property#2141

Open

csmith49 wants to merge 11 commits into main from feat/incremental-view

Conversation


@csmith49 csmith49 commented Feb 19, 2026

Summary

This PR updates the conversation state with a persistent View object that is incrementally updated as events are recorded.

To support this, View creation has been refactored to be incremental. Events are added one at a time using the View.add_event function, which updates the view in-place and handles condensation application, unhandled condensation request tracking, and so on.

This should improve performance (we no longer need to rebuild a view from all events every time the agent takes a step) and make View a first-class object accessible wherever the conversation state is.
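
The incremental pattern described above can be sketched roughly as follows. This is a simplified illustration, not the SDK's implementation: only the names View.add_event and View.from_events come from the PR, and the Event/Condensation stand-ins here are minimal:

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    id: str


@dataclass
class Condensation(Event):
    # IDs of events this condensation forgets (illustrative field name).
    forgotten_ids: frozenset = frozenset()


@dataclass
class View:
    events: list = field(default_factory=list)

    def add_event(self, event):
        """Update the view in place as a single event is recorded."""
        if isinstance(event, Condensation):
            # Apply the condensation: drop the events it forgets.
            self.events = [
                e for e in self.events if e.id not in event.forgotten_ids
            ]
        else:
            self.events.append(event)

    @staticmethod
    def from_events(events):
        """Batch construction: fold add_event over the full event log."""
        view = View()
        for event in events:
            view.add_event(event)
        return view
```

With this shape, batch construction is just a fold over the incremental path, so both stay consistent by definition.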

Design Decisions

Views are maintained by ConversationState objects as a private field, and exposed as a property. This ensures the view is not serialized when the conversation state is saved. Instead, we rebuild the view from the whole list of events when the state is deserialized.

The idea is to avoid accidentally serializing events outside the file system store we maintain, but there is the tradeoff that loading conversation states might be slightly more expensive. This is an easy decision to change should the need arise.
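
A rough sketch of the private-field-plus-property arrangement described above. The names beyond view and add_event are illustrative stand-ins, not the SDK's actual API:

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    id: str


@dataclass
class View:
    events: list = field(default_factory=list)

    def add_event(self, event):
        self.events.append(event)

    @staticmethod
    def from_events(events):
        view = View()
        for event in events:
            view.add_event(event)
        return view


class ConversationState:
    def __init__(self, events=None):
        self.events = list(events or [])
        # Private field: rebuilt from the full event list when a state is
        # deserialized, and never written into the serialized form itself.
        self._view = View.from_events(self.events)

    @property
    def view(self):
        # Exposed read-only as a property, mirroring the PR's design.
        return self._view

    def add_event(self, event):
        """Single mutation path that keeps events and view in sync."""
        self.events.append(event)
        self._view.add_event(event)
```

The rebuild-on-load cost mentioned above is the View.from_events call in the constructor; everything after that is incremental.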

Performance "Benefits"

While reducing the number of View.from_events calls will technically improve performance, in most use cases we expect the gains to be minor.

While working on this PR I had a stack trace sampler monitoring the OpenHands ACP instance that was assisting me. On that trace, something like 99% of the samples from the Agent.step function were API calls to the LLM. Other traces have that number lower (to make room for critic calls and callback handlers), but View construction has never been more than 0.5% of the duration of Agent.step.

Point is, we're hugely network-bound.

The real benefit of this PR is making View a first-class object accessible to more of the system. It exposes precisely the list of events that are converted to messages and sent to the LLM, and so represents the agent's "attention window". That has to be helpful outside the condenser.

Breaking Changes

  • View.condensations field removed.
  • View.enforce_properties changed from a static method to an instance method that modifies the calling view in place.
  • prepare_llm_messages no longer takes a list of events as input. Instead of converting a list of events into a view, it now takes a view directly.

Lastly, because View.enforce_properties is only called when a view is constructed directly from a list of events (instead of being incrementally built), we now only enforce properties when conversation states are loaded from disk.

Checklist

  • If the PR is changing/adding functionality, are there tests to reflect this?
  • If there is an example, have you run the example to make sure that it works?
  • If there are instructions on how to run the code, have you followed the instructions and made sure that it works?
  • If the feature is significant enough to require documentation, is there a PR open on the OpenHands/docs repository with the same branch name?
  • Is the github CI passing?

Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

| Variant | Architectures | Base Image | Docs / Tags |
|---|---|---|---|
| java | amd64, arm64 | eclipse-temurin:17-jdk | Link |
| python | amd64, arm64 | nikolaik/python-nodejs:python3.12-nodejs22 | Link |
| golang | amd64, arm64 | golang:1.21-bookworm | Link |

Pull (multi-arch manifest)

```shell
# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:b6e828d-python
```

Run

```shell
docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-b6e828d-python \
  ghcr.io/openhands/agent-server:b6e828d-python
```

All tags pushed for this build

ghcr.io/openhands/agent-server:b6e828d-golang-amd64
ghcr.io/openhands/agent-server:b6e828d-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:b6e828d-golang-arm64
ghcr.io/openhands/agent-server:b6e828d-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:b6e828d-java-amd64
ghcr.io/openhands/agent-server:b6e828d-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:b6e828d-java-arm64
ghcr.io/openhands/agent-server:b6e828d-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:b6e828d-python-amd64
ghcr.io/openhands/agent-server:b6e828d-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-amd64
ghcr.io/openhands/agent-server:b6e828d-python-arm64
ghcr.io/openhands/agent-server:b6e828d-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-arm64
ghcr.io/openhands/agent-server:b6e828d-golang
ghcr.io/openhands/agent-server:b6e828d-java
ghcr.io/openhands/agent-server:b6e828d-python

About Multi-Architecture Support

  • Each variant tag (e.g., b6e828d-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., b6e828d-python-amd64) are also available if needed

@csmith49 csmith49 self-assigned this Feb 19, 2026
csmith49 and others added 2 commits February 19, 2026 16:20
…ties

Replace @cached_property with a manually-invalidated cache for
View.manipulation_indices. The cached_property was never invalidated
when add_event or enforce_properties mutated self.events, causing
stale indices to be returned.

Now _invalidate_manipulation_indices() is called in every branch of
add_event and enforce_properties that modifies the events list.

Co-authored-by: openhands <openhands@all-hands.dev>
@csmith49 csmith49 changed the title feat(view): Incremental view construction feat(view): Incremental view construction and conversation state view property Feb 19, 2026
@github-actions

github-actions bot commented Feb 19, 2026

Coverage

Coverage Report

| File | Stmts | Miss | Cover | Missing |
|---|---|---|---|---|
| openhands-sdk/openhands/sdk/agent/agent.py | 231 | 36 | 84% | 94, 98, 238–240, 242, 272–273, 280–281, 313, 366–367, 369, 409, 548–549, 554, 566–567, 572–573, 592–593, 595, 623–624, 631–632, 636, 644–645, 682, 688, 700, 707 |
| openhands-sdk/openhands/sdk/agent/utils.py | 55 | 3 | 94% | 62, 82–83 |
| openhands-sdk/openhands/sdk/context/view/view.py | 72 | 7 | 90% | 82, 93–98 |
| openhands-sdk/openhands/sdk/conversation/state.py | 191 | 8 | 95% | 188, 192, 203, 354, 400–402, 516 |
| openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py | 329 | 18 | 94% | 277, 282, 310, 375, 417, 563–564, 567, 713, 721, 723, 734, 736–738, 920, 927–928 |
| TOTAL | 18332 | 5570 | 69% | |

@csmith49 csmith49 added the integration-test Runs the integration tests and comments the results label Feb 19, 2026
@github-actions

Hi! I started running the integration tests on your PR. You will receive a comment with the results shortly.

@csmith49 csmith49 added condenser-test Triggers a run of all condenser integration tests and removed integration-test Runs the integration tests and comments the results labels Feb 20, 2026
@github-actions

Hi! I started running the condenser tests on your PR. You will receive a comment with the results shortly.

Note: These are non-blocking tests that validate condenser functionality across different LLMs.

@github-actions

Condenser Test Results (Non-Blocking)

These tests validate condenser functionality and do not block PR merges.

🧪 Integration Tests Results

Overall Success Rate: 100.0%
Total Cost: $0.99
Models Tested: 2
Timestamp: 2026-02-20 03:32:37 UTC

📊 Summary

| Model | Overall | Tests Passed | Skipped | Total | Cost | Tokens |
|---|---|---|---|---|---|---|
| litellm_proxy_anthropic_claude_opus_4_5_20251101 | 100.0% | 5/5 | 0 | 5 | $0.90 | 440,917 |
| litellm_proxy_gpt_5.1_codex_max | 100.0% | 2/2 | 3 | 5 | $0.09 | 62,635 |

📋 Detailed Results

litellm_proxy_anthropic_claude_opus_4_5_20251101

  • Success Rate: 100.0% (5/5)
  • Total Cost: $0.90
  • Token Usage: prompt: 424,641, completion: 16,276, cache_read: 374,439, cache_write: 43,738, reasoning: 911
  • Run Suffix: litellm_proxy_anthropic_claude_opus_4_5_20251101_6519856_opus_condenser_run_N5_20260220_032807

litellm_proxy_gpt_5.1_codex_max

  • Success Rate: 100.0% (2/2)
  • Total Cost: $0.09
  • Token Usage: prompt: 58,879, completion: 3,756, cache_read: 18,304, reasoning: 1,792
  • Run Suffix: litellm_proxy_gpt_5.1_codex_max_6519856_gpt51_condenser_run_N5_20260220_032804
  • Skipped Tests: 3

Skipped Tests:

  • c01_thinking_block_condenser: Model litellm_proxy/gpt-5.1-codex-max does not support extended thinking or reasoning effort
  • c05_size_condenser: This test stresses long repetitive tool loops to trigger size-based condensation. GPT-5.1 Codex Max often declines such requests for efficiency/safety reasons.
  • c04_token_condenser: This test stresses long repetitive tool loops to trigger token-based condensation. GPT-5.1 Codex Max often declines such requests for efficiency/safety reasons.

@csmith49 csmith49 marked this pull request as ready for review February 20, 2026 03:41
@all-hands-bot all-hands-bot left a comment

🟡 Acceptable - Solid design making View a first-class object. Manual cache invalidation is correct but fragile. No critical issues.

Key Insight: The real win here isn't performance—it's making View an explicit representation of the agent's attention window, accessible throughout the system. Good architectural move.

See inline comments for improvement opportunities.

```python
for property in ALL_PROPERTIES:
    results &= property.manipulation_indices(self.events)

self._cached_manipulation_indices = results
```

🟠 Important: Manual cache invalidation is fragile. This works now, but it's easy to forget _invalidate_manipulation_indices() when adding new methods that modify self.events.

Suggestion: Consider documenting this invariant clearly, or use a different pattern like:

  • Wrap self.events in a custom list type that auto-invalidates on mutation
  • Use Pydantic's @computed_field(cached=True) if the model config allows it

Not blocking—current implementation is correct—but this is a maintenance risk.
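
The first suggestion could be sketched as a thin `MutableSequence` wrapper that fires a callback on every mutation, so the invalidation call cannot be forgotten. The names here are illustrative, not part of the SDK:

```python
from collections.abc import MutableSequence


class InvalidatingList(MutableSequence):
    """A list wrapper that invokes a callback on every mutation."""

    def __init__(self, items, on_mutate):
        self._items = list(items)
        self._on_mutate = on_mutate

    def __getitem__(self, index):
        return self._items[index]

    def __setitem__(self, index, value):
        self._items[index] = value
        self._on_mutate()

    def __delitem__(self, index):
        del self._items[index]
        self._on_mutate()

    def __len__(self):
        return len(self._items)

    def insert(self, index, value):
        # MutableSequence derives append/extend/etc. from insert, so every
        # growth path funnels through this one notification point.
        self._items.insert(index, value)
        self._on_mutate()
```

Wiring `on_mutate` to `_invalidate_manipulation_indices` would make the cache impossible to leave stale, at the cost of one extra indirection layer.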

Comment on lines 106 to +127

```diff
     Since enforcement is intended as a fallback to inductively maintaining the
     properties via the associated manipulation indices, any time a property must be
     enforced a warning is logged.

     Modifies the view in-place.
     """
     for property in ALL_PROPERTIES:
-        events_to_forget = property.enforce(current_view_events, all_events)
+        events_to_forget = property.enforce(self.events, all_events)
         if events_to_forget:
             logger.warning(
                 f"Property {property.__class__} enforced, "
                 f"{len(events_to_forget)} events dropped."
             )
-            return View.enforce_properties(
-                [
-                    event
-                    for event in current_view_events
-                    if event.id not in events_to_forget
-                ],
-                all_events,
-            )
-    return current_view_events
+            self.events = [
+                event for event in self.events if event.id not in events_to_forget
+            ]
+            self._invalidate_manipulation_indices()

 @staticmethod
 def from_events(events: Sequence[Event]) -> View:
     """Create a view from a list of events, respecting the semantics of any
     condensation events.
+    # If we've forgotten events to enforce the properties, we'll need to
+    # attempt to apply each property again. Once we get all the way through
+    # the properties without any kind of modification, we can exit the loop.
+    self.enforce_properties(all_events)
```

🟡 Suggestion: The recursive call with break is a bit harder to follow than necessary. The old tail-recursive pattern was clearer.

Consider making the recursion explicit:

```python
def enforce_properties(self, all_events: Sequence[Event]) -> None:
    """Enforce all properties, recursively dropping events until stable."""
    while True:
        changed = False
        for property in ALL_PROPERTIES:
            events_to_forget = property.enforce(self.events, all_events)
            if events_to_forget:
                logger.warning(
                    f"Property {property.__class__} enforced, "
                    f"{len(events_to_forget)} events dropped."
                )
                self.events = [
                    event for event in self.events
                    if event.id not in events_to_forget
                ]
                self._invalidate_manipulation_indices()
                changed = True
                break  # Restart property checks
        if not changed:
            break  # All properties satisfied
```

This makes it clearer that we're iterating until a fixed point.

```python
logger.debug(f"Skipping non-LLMConvertibleEvent of type {type(event)}")

@staticmethod
def from_events(events: Sequence[Event]) -> View:
```

🟢 Nit: This incrementally builds the view, which is good. But there's no test coverage for the case where a non-LLMConvertibleEvent (that's not condensation-related) is added.

Consider adding a test that verifies these are silently skipped:

```python
def test_add_event_skips_non_llm_convertible():
    view = View()
    view.add_event(ConversationStateUpdateEvent(...))
    assert len(view.events) == 0
```

Not critical, just helps document the behavior.

@enyst

enyst commented Feb 20, 2026

@OpenHands Do a /codereview-roasted on this PR. Post your review as a comment on the PR. Note that it will be rendered as markdown.

Once you posted, take a step back and look with fresh eyes: re-read critically your review and the PR, and analyze every point of it, is it actually real, is it minor, is it major compared to what it seemed? Post your re-analysis as a new comment on the PR.

@openhands-ai

openhands-ai bot commented Feb 20, 2026

I'm on it! enyst can track my progress at all-hands.dev


enyst commented Feb 20, 2026

/codereview-roasted

Taste rating: 🔴 Needs improvement

You took a relatively simple data model ("the event log is the source of truth") and replaced it with two mutable sources of truth: EventLog and a cached/derived View that must be kept in sync manually.

That can be fine if you make it impossible to get out of sync. Right now it’s very possible, and the failure mode is the worst kind: silent wrong behavior.


[CRITICAL ISSUES]

  • ConversationState now has a cache without a hard invalidation story

    • Where: openhands-sdk/openhands/sdk/conversation/state.py (view + add_event at ~206–218), openhands-sdk/openhands/sdk/agent/agent.py (uses state.view at ~246)

    • Problem: state.events is still exposed as a mutable EventLog, but only ConversationState.add_event() updates the view. Any code doing the old/obvious thing:

      `state.events.append(e)`

      will now silently fail to update state.view, and therefore the next LLM call (which now uses state.view) will ignore those events.

      This is a behavioral break that won’t crash; it’ll just make agents “forget” things and make debugging miserable.

    • Fix direction: either (a) make events append go through ConversationState.add_event (proxy/wrapper), (b) stop exposing a mutable event log, or (c) compute the view lazily from events (invalidate on append) instead of keeping two independently-mutable structures.

  • View.add_event() can poison the view permanently after a bad Condensation

    • Where: openhands-sdk/openhands/sdk/context/view/view.py (add_event ~130–169, enforcement ~100–129)

    • Problem: add_event() applies Condensation.apply() directly (~153–156) but does not re-run enforce_properties(). Since Condensation.apply() can forget arbitrary IDs, it can violate ToolCallMatchingProperty / BatchAtomicityProperty invariants.

      Previously, rebuilding a view via from_events() + enforcement was an automatic “self-heal” step. Now enforcement only happens on resume (ConversationState.create(... resume ...) rebuilds view). In other words: one malformed condensation can corrupt the runtime view for the rest of the process.

    • Fix direction: enforce properties (or at least validate) after applying a Condensation, or enforce at the ConversationState.add_event() layer when the appended event is a Condensation.
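
Fix direction (c) from the first issue above (compute the view lazily from events and invalidate on append) might look like this minimal sketch. The names are illustrative, and the filter standing in for real view construction is purely a placeholder:

```python
class LazyViewState:
    """One source of truth (the event list) plus a derived, cached view."""

    def __init__(self):
        self._events = []
        self._view_cache = None  # None means "stale, rebuild on next access"

    def append_event(self, event):
        self._events.append(event)
        self._view_cache = None  # any append invalidates the derived view

    @property
    def view(self):
        if self._view_cache is None:
            # Rebuild from the event log. A stand-in for real view
            # construction: here we just hide events prefixed with "_".
            self._view_cache = [e for e in self._events if not e.startswith("_")]
        return self._view_cache
```

Because every append goes through the same method that clears the cache, the "forgot to update the view" failure mode disappears by construction, at the cost of an occasional full rebuild.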


[IMPROVEMENT OPPORTUNITIES]

  • Breaking API churn with no compatibility shim

    • Where: openhands-sdk/openhands/sdk/agent/utils.py (prepare_llm_messages now takes View at ~114–179)
    • Problem: This is a signature-breaking change in a utility function that downstream users will import.
    • Fix direction: accept both View and Sequence[Event] (deprecated path), or provide a transitional helper.
  • Manual caching + public mutability is a trap

    • Where: openhands-sdk/openhands/sdk/context/view/view.py (manipulation_indices cache ~40–60)
    • Problem: you fixed invalidation for the mutations you control (add_event, enforce_properties), but view.events is still a public list. Any direct mutation makes the cache wrong again.
    • Fix direction: either treat events as effectively private (and document that), or stop caching something derived from a mutable list you don’t control.
  • Dead helper left behind

    • Where: View.unhandled_condensation_request_exists (view.py ~84–98)
    • Problem: unused after the incremental refactor. It’s harmless, but it reads like unfinished surgery.
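
The transitional shim suggested for prepare_llm_messages above could look like the following sketch. The View class and the message conversion here are simplified stand-ins, not the real SDK types:

```python
import warnings


class View:
    """Minimal stand-in for the SDK's View."""

    def __init__(self, events):
        self.events = list(events)

    @staticmethod
    def from_events(events):
        return View(events)


def prepare_llm_messages(view_or_events):
    """Accept a View directly, or a legacy sequence of events (deprecated)."""
    if not isinstance(view_or_events, View):
        warnings.warn(
            "Passing a sequence of events is deprecated; pass a View instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        view_or_events = View.from_events(view_or_events)
    # Stand-in for the real event-to-message conversion.
    return [f"message:{e}" for e in view_or_events.events]
```

Downstream callers keep working during a deprecation window, and the warning points them at the new calling convention.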

[TESTING GAPS]

  • The new tests/sdk/conversation/local/test_state_view.py is good coverage for the happy path.
  • What’s missing is a test for the real regression risk introduced by the cache:
    • If state.events.append() is still “allowed”, verify the view stays correct (or explicitly assert it doesn’t and document that add_event() is required).

VERDICT

Needs rework before I’d be comfortable calling this “safer/faster”. The concept (incremental view) is fine; the current implementation relies on “everyone always uses the right mutation path” and that’s not an engineering strategy.

KEY INSIGHT

You didn’t add a View; you added a cache. Caches must have a bulletproof invalidation model, or they become distributed lies.


enyst commented Feb 20, 2026

Re-analysis of my /codereview-roasted (fresh eyes)

Re-reading both the PR and my own comment, here’s what I think is actually important vs what’s more “defense-in-depth / taste”.

Still major / real

  • ConversationState.events vs ConversationState.view divergence is a genuine behavioral trap.
    • This isn’t theoretical: the SDK continues to expose a mutable EventLog via state.events, but Agent.step() now uses state.view to build the LLM context.
    • If any downstream code has ever appended events directly to state.events (which used to be the obvious / only knob), those events will now be omitted from the LLM context silently.
    • Even if maintainers consider state.events.append() “internal”, the current API surface doesn’t communicate that, and the failure mode is non-obvious.

If only one thing gets addressed, it should be this: either make events effectively append-only through ConversationState.add_event() (proxy/wrapper), or make it impossible/explicitly unsupported to mutate events directly.

Probably medium / depends on invariants

  • “Bad Condensation poisons view”: this is more about robustness than an immediate bug.
    • In the intended flow, condensers should only forget along manipulation_indices, so properties should remain satisfied by construction.
    • However, the moment you accept that Condensation.apply() can remove arbitrary IDs (it can), then skipping any enforcement/validation after applying a Condensation means the view can become invalid until restart/resume.

So: I still think a post-condensation sanity/enforcement hook is a good idea, but I’d downgrade this from “guaranteed breakage” to “cheap guardrail that prevents future weirdness / version skew issues”.
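
Such a guardrail could be a cheap re-validation on the condensation branch of add_event. Here is a sketch using string events and a toy property check as a stand-in for the real ALL_PROPERTIES enforcement:

```python
class GuardedView:
    """Toy view that re-validates invariants right after a condensation."""

    def __init__(self):
        self.events = []

    def _enforce_properties(self, all_events):
        # Stand-in for real property enforcement: drop any "call:<id>" event
        # whose matching "result:<id>" event has been forgotten.
        ids = set(self.events)
        self.events = [
            e for e in self.events
            if not (e.startswith("call:") and "result:" + e[5:] not in ids)
        ]

    def add_event(self, event, all_events):
        if event.startswith("condense:"):
            forgotten = set(event.split(":", 1)[1].split(","))
            self.events = [e for e in self.events if e not in forgotten]
            # Cheap guardrail: a condensation may forget arbitrary ids, so
            # re-validate invariants immediately instead of waiting for resume.
            self._enforce_properties(all_events)
        else:
            self.events.append(event)
```

Because enforcement is cheap relative to an LLM call, running it only on the condensation branch adds negligible overhead while closing the "poisoned view until restart" window.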

Probably minor

  • prepare_llm_messages(events -> view) signature break: likely low impact inside this repo (only internal callers), but could break downstream imports. Whether that matters depends on how “public” you consider openhands.sdk.agent.utils.
  • Manual caching + mutable list: real, but typical. If View.events is treated as internal, the cache is fine.
  • Unused helper (unhandled_condensation_request_exists): pure cleanup nit.

Net

My original review wasn’t wrong about the core risk, but I was too harsh in implying the condensation enforcement change is necessarily a runtime bug. The API/invariant mismatch between events and view is the only point I’d call “must address”, because it’s the one that can bite real users without any obvious stacktrace.

@openhands-ai

openhands-ai bot commented Feb 20, 2026

Summary of work completed

  • Retrieved PR feat(view): Incremental view construction and conversation state view property #2141 metadata and inspected the key diffs affecting:

    • View incremental construction (View.add_event, in-place enforce_properties, and manipulation_indices caching)
    • ConversationState gaining a persistent private _view and a view property, plus add_event() to keep _events and _view in sync
    • Agent/conversation paths switching from state.events to state.view for prepare_llm_messages
    • Test updates and new tests covering view synchronization and cache invalidation
  • Posted a /codereview-roasted review as a markdown PR comment focusing on the main engineering risks:

    • Two mutable sources of truth (state.events vs state.view) and the silent divergence hazard if anything appends directly to state.events
    • Potential invariant breakage if a Condensation forgets events in a way that violates view properties and enforcement isn’t re-run at runtime
    • API churn risk (prepare_llm_messages signature change), cache-vs-mutable-list pitfalls, and leftover unused helper
  • Then posted a second PR comment with a fresh-eyes re-analysis, downgrading some points and re-emphasizing the one truly major issue:

    • The events vs view divergence trap remains the “must address” item
    • Condensation enforcement concern reframed as robustness/guardrail rather than guaranteed bug

Compliance checklist (per request)

  • Did a /codereview-roasted on the PR
  • Posted the review as a PR comment (markdown)
  • Re-read critically and posted a separate re-analysis comment

Changes / pushing

  • No code changes were made in the repository as part of this request, so there is nothing to push.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

condenser-test Triggers a run of all condenser integration tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants

Comments