feat: optimize web UI performance for large chat histories#21
Conversation
Covers conditional chat list snapshots, log pagination, lazy deserialization at startup, scoped dirty signals, and batched DOM rendering. Made-with: Cursor
behavior, auto-discovery, add tasks-store.js to file list Made-with: Cursor
Five coordinated changes to reduce load times, memory usage, and network overhead: 1. Conditional chat list in snapshots — skip sending contexts[]/tasks[] when the list hasn't changed (timestamp-based versioning) 2. Log pagination — send only last 50 entries on chat switch, load earlier messages on demand via new POST /chat_logs endpoint 3. Lazy deserialization — defer heavy History parsing at startup, hydrate agents only when a chat is actually accessed 4. Scoped dirty signals — new log items notify only clients viewing that context instead of broadcasting to all connections 5. Batched DOM rendering — render messages in chunks of 20 via requestAnimationFrame to eliminate UI freezes Includes 14 unit tests covering all new behavior. Made-with: Cursor
Add chat_list_updated_at and has_earlier_logs to hardcoded snapshot dicts in test_snapshot_schema_v1.py and test_state_snapshot.py. Made-with: Cursor
Nafania
left a comment
There was a problem hiding this comment.
Code Review: feat: optimize web UI performance for large chat histories
Strengths
- Well-designed conditional chat list — The timestamp-based
touch_chat_list()/chat_list_sinceprotocol is clean and effective. SkippingAgentContext.all()iteration when the list hasn't changed is a meaningful optimization. - Clean
Log.output(tail=N)implementation — The tail slicing logic correctly deduplicates updates, handles edge cases (empty log, tail > total, exact count), and the unit tests for it are thorough (8 test cases). - Scoped dirty signals are a solid win — Switching from
mark_dirty_alltomark_dirty_for_contextin_notify_state_monitoris the right fix. Combined with conditional chat lists, other tabs no longer need the broadcast. - Good spec — The design doc is clear, well-structured, and covers edge cases. The 5 coordinated changes are well-motivated.
Issues
Important
See inline comments for details on each issue.
-
Missing
_ensure_hydrated()guards — Several existing code paths (api_message.py,history_get.py,ctx_window_get.py,import_knowledge.py,knowledge_path_get.py,task_scheduler.py,settings.py) accesscontext.agent0directly without calling_ensure_hydrated(). They will silently get a default empty agent instead of the deserialized one. -
Log.output()polymorphic return type — Returnslistwithouttail,tuple[list, bool]withtail. This dual return type is error-prone and will confuse future callers. -
chat_logs.pyusesbeforeas array index —beforecomes from the DOM element'snoattribute but is used as a direct array index intolog.logs. Ifnovalues ever diverge from array positions, pagination breaks silently. -
chat_logs.pyaccesses privatelog._lock— Breaks encapsulation. The lock management should stay insideLog. -
Batched rendering races with scroll-to-bottom —
shouldScrollandscrollToBottom()fire after the first batch of 20 but before the remainingrequestAnimationFramebatches render. -
loadEarlierLogshas no double-click protection — Rapid clicks trigger multiple concurrent fetches for the same page, duplicating prepended messages.
Minor
-
prependMessagesclearsinnerHTML— Could cause a flash of empty content.insertBeforeorDocumentFragmentwould be safer. -
Lazy deserialization and chat logs tests are too shallow —
test_raw_agents_default_noneinspects source text viainspect.getsource,test_chat_logs_importjust checks the module imports. Neither catches behavioral regressions. -
Unrelated OAuth docs in this PR — Commits 1–4 (OAuth design spec, review fixes, implementation plan, plan review fixes) are unrelated to webui performance. They should be in their own PR to keep the review scope clean.
Assessment
Ready to merge: No — fix Important issues first
The core performance ideas are sound and the backend changes are mostly clean. The main concern is the lazy deserialization — _ensure_hydrated() is only called in 3 places (nudge, get_agent, _serialize_context) but agent0 is accessed directly from at least 7 other locations. This will cause silent data loss (empty history) for users who e.g. export history or view context window on a lazily-loaded chat. The batched rendering scroll race and the missing loading guard are also likely to cause visible UX issues.
|
|
||
| def _ensure_hydrated(self): | ||
| """Ensure agents and history are fully deserialized (lazy hydration).""" | ||
| if self._raw_agents is not None: |
There was a problem hiding this comment.
Important: Missing _ensure_hydrated() guards on external call sites.
This method is only called from nudge(), get_agent(), and _serialize_context(). But several other code paths access context.agent0 or context.streaming_agent directly:
python/api/api_message.py:84—context.agent0.config.profilepython/api/history_get.py:8—context.streaming_agent or context.agent0python/api/ctx_window_get.py:10—context.streaming_agent or context.agent0python/api/import_knowledge.py:19,42—context.agent0python/api/knowledge_path_get.py:18—context.agent0python/helpers/task_scheduler.py:894—context.streaming_agent or context.agent0python/helpers/settings.py:623—ctx.agent0
Before hydration, these will get the default empty Agent(0, config, self) created in __init__, not the deserialized agent with history. For example, exporting chat history (history_get) on a lazily-loaded chat would return empty.
Suggested fix: either (a) add _ensure_hydrated() calls to all external access points that touch agent0 (safest), or (b) override __getattr__ to auto-hydrate on agent0 access (more elegant but harder to debug).
There was a problem hiding this comment.
Fixed. agent0 and streaming_agent are now auto-hydrating properties — getter calls _ensure_hydrated() before returning the backing field. All 7 unguarded access points are now covered automatically. No need to hunt for callers.
There was a problem hiding this comment.
Verified. Property-based auto-hydration is the right approach — covers all callers without hunting for them.
Re-entrancy protection is correct: hydrate_context_agents clears _raw_agents = None before calling _deserialize_agents, so if anything during deserialization accesses context.agent0, the property getter sees _raw_agents is None and returns the backing field immediately (no infinite recursion).
One edge case to be aware of: if _deserialize_agents raises, _raw_agents is already cleared, so the context permanently has the default empty agent with no way to retry hydration. Acceptable — if deserialization fails, the data is corrupt anyway.
|
|
||
| out = [logs[u].output() for u in unique] | ||
|
|
||
| if tail is not None: |
There was a problem hiding this comment.
Important: Polymorphic return type.
output() returns list when tail is None and tuple[list, bool] when tail is set. This dual return type is error-prone — any new caller must know to check whether tail was passed to decide how to unpack the result.
The caller in state_snapshot.py already handles this by branching on from_no == 0, but nothing in the type signature warns future developers.
Suggested fix: always return a named tuple or dataclass LogOutput(items=..., has_earlier=...), or split into two explicit methods (e.g. output() and output_tail(n)).
There was a problem hiding this comment.
Fixed. output() now always returns (list, bool). Single caller in state_snapshot.py updated to unpack with logs, _ = ....
There was a problem hiding this comment.
Verified. Clean fix — single return shape, single caller updated.
| return {"logs": [], "has_more": False} | ||
|
|
||
| log = context.log | ||
| with log._lock: |
There was a problem hiding this comment.
Important: Accessing private log._lock from an API handler breaks encapsulation.
If Log ever changes its locking strategy (e.g. switches to RLock, adds a read/write lock, or removes the lock), this handler silently breaks.
Suggested fix: add a public method to Log, e.g. get_paginated(before, limit) -> dict, that handles the locking internally and returns {"logs": [...], "has_more": bool}.
There was a problem hiding this comment.
Fixed. Added Log.get_items_before(before, limit) — locking and slicing are now internal to Log. chat_logs.py is a one-liner delegation.
There was a problem hiding this comment.
Verified. get_items_before() encapsulates locking and limit clamping. chat_logs.py is now a clean one-liner delegation.
| if before <= 0: | ||
| before = len(all_logs) | ||
|
|
||
| start_idx = max(0, before - limit) |
There was a problem hiding this comment.
Important: before is used as an array index, but it comes from a log item no.
The frontend extracts before from firstMsg.id.replace("message-", ""), which is the log item's no attribute. Here it is used directly as an array index (all_logs[start_idx:before]). This works only if no values are contiguous and equal to array positions.
If log items are ever renumbered, deleted, or have gaps in no values, this pagination will silently return wrong items or crash with an IndexError.
Suggested fix: search for the item with no == before using bisect or a linear scan, then slice relative to that position. Or ensure no is always the array index by contract and document that invariant.
There was a problem hiding this comment.
no == array index is a structural invariant: Log never deletes items, no is assigned as len(self.logs) before append, and _deserialize_log uses sequential i. The new get_items_before() method keeps the slicing inside Log where this invariant is enforced.
There was a problem hiding this comment.
Checked. LogItem.no is assigned as len(self.logs) before append, Log never deletes items, _deserialize_log uses sequential indices. The invariant holds by construction. Moving the slicing into Log.get_items_before() co-locates the assumption with its enforcement — accepted.
| for (let i = 0; i < messages.length; i++) { | ||
| _massRender = historyEmpty || (isLargeAppend && i < cutoff); | ||
| results.push(setMessage(messages[i]) || {}); | ||
| if (massRender && messages.length > BATCH_SIZE) { |
There was a problem hiding this comment.
Important: Batched rendering races with scroll-to-bottom.
In the batched path (lines 94–113), results only captures items from the first batch of 20. The shouldScroll decision and scrollToBottom() call at the end of setMessages (around line 125) fire immediately after the first batch — before the remaining requestAnimationFrame batches have rendered.
Result: the chat appears to scroll to the bottom of the first 20 messages, but subsequent batches render below the viewport without scrolling.
Suggested fix: track whether we need to scroll, and in the last requestAnimationFrame callback, trigger the scroll after the final batch renders.
There was a problem hiding this comment.
Fixed. scrollToBottom now fires in the terminal condition of renderNextBatch — after the last RAF batch completes. The batched path returns early before the normal scroll call.
There was a problem hiding this comment.
Verified. shouldScroll computed after first batch, scroll deferred to terminal condition of renderNextBatch. The return results on line 121 correctly short-circuits the normal scroll path. In the mass-render case historyEmpty is true, so shouldScroll is always true regardless of first-batch content — correct for chat-switch.
|
|
||
| const existingNodes = Array.from(history.children); | ||
|
|
||
| history.innerHTML = ""; |
There was a problem hiding this comment.
Minor: prependMessages clears innerHTML then re-appends nodes.
This approach works correctly (DOM node references survive innerHTML = "") but will cause a brief flash of empty content on slower devices between clearing and re-appending.
Consider using insertBefore(fragment, history.firstChild) with a DocumentFragment for the prepended messages instead — no clearing needed, no flicker.
There was a problem hiding this comment.
Fixed. prependMessages now moves existing children into a DocumentFragment, renders old messages, then re-appends the fragment. No innerHTML clearing, no flash.
There was a problem hiding this comment.
Verified. No innerHTML clearing, no flicker. The while (history.firstChild) loop is O(n) DOM detachments vs a single innerHTML = "", but the flicker elimination is worth it. A future optimization could use insertBefore(newFragment, history.firstChild) to avoid detaching existing children entirely, but this is fine for now.
| return hasEarlierLogs; | ||
| } | ||
|
|
||
| export async function loadEarlierLogs() { |
There was a problem hiding this comment.
Important: No loading state or double-click protection.
loadEarlierLogs is async but the "Load earlier messages" button is never disabled during the fetch. Rapid clicks will fire multiple concurrent requests for the same page of logs, and prependMessages will insert duplicates each time.
Suggested fix:
let loadingEarlierLogs = false;
export async function loadEarlierLogs() {
if (loadingEarlierLogs || !hasEarlierLogs || !context) return;
loadingEarlierLogs = true;
try { ... } finally { loadingEarlierLogs = false; }
}Also consider showing a loading spinner on the button while the request is in flight.
There was a problem hiding this comment.
Fixed. Added loadingEarlierLogs guard with try/finally. Button shows "Loading..." + disabled=true during the fetch.
There was a problem hiding this comment.
Verified. Guard flag with try/finally, button shows "Loading..." + disabled=true during fetch. Clean.
| # --------------------------------------------------------------------------- | ||
|
|
||
| class TestLazyDeserialization: | ||
|
|
There was a problem hiding this comment.
Minor: This test inspects source code text, not behavior.
inspect.getsource(AgentContext.__init__) + assert "_raw_agents" in src passes as long as the string appears anywhere in __init__ — even in a comment. It doesn't verify that lazy deserialization actually works.
Suggested: create an AgentContext with _raw_agents set to a non-None value, verify agent0 is the default before hydration, call _ensure_hydrated(), then verify agent0 is the properly deserialized agent.
Similarly, TestChatLogsEndpoint.test_chat_logs_import (line 178) only checks the module imports — consider testing actual pagination logic (e.g., create a Log with items, call the handler, verify slicing and has_more).
There was a problem hiding this comment.
Fixed. Replaced source-text inspection with behavioral test for hydration (test_hydrate_deserializes_agents). Added 4 behavioral tests for get_items_before() covering pagination, has_more flag, zero-before fallback, and limit clamping.
There was a problem hiding this comment.
Verified. test_hydrate_deserializes_agents tests actual behavior: passes raw data, asserts _deserialize_agents was called, confirms _raw_agents cleared. The 4 get_items_before tests cover basic slice, boundary (from start), zero-before fallback, and limit clamping. Meaningful improvement.
test_raw_agents_default_none still uses inspect.getsource — would be cleaner as an actual instantiation test, but it is not blocking.
1. agent0/streaming_agent → auto-hydrating properties (guards all 7 unguarded access points: api_message, history_get, ctx_window_get, import_knowledge, knowledge_path_get, task_scheduler, settings) 2. Log.output() always returns (list, bool) — no polymorphic return 3. Log.get_items_before() encapsulates pagination logic; chat_logs.py no longer accesses private log._lock 4. Batched rendering defers scrollToBottom to last RAF callback 5. loadEarlierLogs: loading guard + button disabled state 6. prependMessages uses DocumentFragment instead of innerHTML clear 7. Tests: +5 behavioral tests for get_items_before and hydration Made-with: Cursor
Re-review after fix commit 5f0499eVerified all 8 fixes against the code. Summary:
All Important issues resolved. Ready to merge after tests pass. |
Three tests expected log.output() to return a plain list; now it always returns (list, bool). Updated to unpack correctly. Made-with: Cursor
_60_rename_chat.py sets context.name but never notified clients. Before scoped dirty signals this was masked by mark_dirty_all on every log update. Now we explicitly touch_chat_list() + mark_dirty_all() after rename so all tabs see the updated name immediately. Made-with: Cursor
Covers three changes: move unread indicator to separate dot, add relative timestamps, sort chats by last_message. Made-with: Cursor
- Move unread indicator to separate blue dot right of chat name (no longer overrides project color ball) - Add relative timestamp (1s/5m/2h/3d/1w) with full datetime tooltip - Sort chats by last_message descending (most recent activity first) Made-with: Cursor
Add a reactive _tick counter that increments via setInterval, referenced in x-text so Alpine re-evaluates formatRelativeTime. Made-with: Cursor
Agent.hist_add_message was setting self.last_message on the Agent instance, but AgentContext.output() reads context.last_message which was only set at creation time. Now updates both and calls touch_chat_list() so sidebar timestamps and sort order reflect actual activity. Made-with: Cursor
After _process_chain completes (or fails), running status changes from true to false. Without touch_chat_list() the frontend never receives updated contexts[] showing running:false, causing messages to be incorrectly queued. Added finally block to ensure chat list timestamp is bumped when agent work ends. Made-with: Cursor
Alpine.js x-text doesn't properly evaluate void+comma operator expressions. Pass _tick as second arg to formatRelativeTime so Alpine tracks the reactive dependency naturally. Made-with: Cursor
mark_dirty_for_context only pushes to clients viewing the active chat. When user switches away before agent finishes, they never receive the running:false update, causing spinner to persist. Now mark_dirty_all() is called alongside touch_chat_list() in the finally block so all clients/tabs get the updated running status. Made-with: Cursor
process_chain completion, formatRelativeTime, and sort order Covers sections 6-10 in test_webui_performance.py: - Chat rename calls touch_chat_list + mark_dirty_all - hist_add_message updates context.last_message - _process_chain finally block broadcasts completion - formatRelativeTime boundary values (s/m/h/d/w) - Chat list sorts by last_message, not created_at Made-with: Cursor
Nafania
left a comment
There was a problem hiding this comment.
Code Review: commits since 5f0499e (sidebar UX, chat signals, test fixes)
Reviewing the 11 new commits: test_log.py fix, merge from main, chat rename refresh, sidebar UX improvements (spec + implementation + 4 follow-up fixes), and new tests.
Strengths
- Chat rename refresh is a good catch — Scoped dirty signals exposed that rename wasn't propagating to other tabs. The fix in
_60_rename_chat.pyis clean and correctly uses bothtouch_chat_list()+mark_dirty_all()since renaming affects all connected clients. - Sidebar UX improvements are well-scoped — Separate unread dot (no longer hijacking project color ball), relative timestamps with tooltip, sort by
last_message. Clean spec, clean implementation. _process_chainfinally block — Correctly broadcastsrunning: falseto all clients even on exception. Theif user:guard prevents subordinate agent calls from triggering unnecessary broadcasts.- Alpine
_tickreactivity — Using_tickas a function argument so Alpine tracks the dependency naturally is the right fix for the void+comma issue. The 10s interval matches the timestamp granularity. test_hist_add_message_sets_both_timestamps— This is a proper behavioral test: calls the actual method, checks the result. Good pattern.
Issues
See inline comments for details.
Important
hist_add_messagecallstouch_chat_list()on every message — effectively disables the conditional chat list optimization during active conversations.
Minor
- New tests repeat the
inspect.getsourceanti-pattern — flagged in the previous review round.
Assessment
Ready to merge: Fix #1 first
The sidebar UX changes are clean and the signal fixes are necessary. The only Important issue is that calling touch_chat_list() on every message during streaming continuously bumps _chat_list_updated_at, so every state push includes the full contexts[]/tasks[] — negating the optimization. A simple debounce fixes it.
Note: OAuth files appeared in the local diff due to the merge with main, but they're not part of this PR's scope. No issues with them in this context.
| self.last_message = now | ||
| self.context.last_message = now | ||
| from python.helpers.state_snapshot import touch_chat_list | ||
| touch_chat_list() |
There was a problem hiding this comment.
Important: touch_chat_list() on every message negates the conditional chat list optimization.
Every call to hist_add_message (user message, AI response, tool results) bumps _chat_list_updated_at. During active streaming, state pushes happen every ~250ms. Since chat_list_since < _chat_list_updated_at is always true, the server sends the full contexts[]/tasks[] on every push — exactly the behavior the original PR optimized away.
Updating context.last_message is correct (sidebar needs it for sort order and timestamps). The issue is calling touch_chat_list() unconditionally.
Suggested fix: debounce touch_chat_list() — only bump _chat_list_updated_at if more than N seconds have elapsed since the last bump from hist_add_message. 5-10s would match the sidebar's timestamp display granularity.
import time
_last_msg_touch = 0.0
_MSG_TOUCH_INTERVAL = 5.0
def _touch_for_last_message():
global _last_msg_touch
now = time.time()
if now - _last_msg_touch >= _MSG_TOUCH_INTERVAL:
_last_msg_touch = now
from python.helpers.state_snapshot import touch_chat_list
touch_chat_list()Alternatively, include last_message in the per-context dirty signal instead of the full chat list rebuild.
| # --------------------------------------------------------------------------- | ||
| # 6. Chat rename triggers chat list refresh | ||
| # --------------------------------------------------------------------------- | ||
|
|
There was a problem hiding this comment.
Minor: New tests repeat the inspect.getsource anti-pattern.
TestChatRenameRefresh.test_rename_extension_calls_touch_chat_list, TestHistAddMessageUpdatesContext.test_hist_add_message_updates_context_last_message, and TestProcessChainCompletion all use inspect.getsource() + string assertions. This was flagged in the previous review.
These tests pass as long as the string appears in the source text — even inside a comment, dead branch, or if False: block. test_hist_add_message_sets_both_timestamps (line 328) shows the right approach: actually call the method and assert on results.
Not blocking, but worth converting when you next touch these tests.
…ehavioral 1. Debounce touch_chat_list() in Agent.hist_add_message with a 5s interval so streaming messages don't negate the conditional chat list optimization (every push was sending full contexts[]/tasks[]). 2. Replace all inspect.getsource-based tests with behavioral tests that actually call the methods and assert on outcomes: - TestChatRenameRefresh: runs change_name, asserts touch+dirty called - TestHistAddMessageUpdatesContext: adds debounce coverage - TestProcessChainCompletion: runs _process_chain, asserts signals Made-with: Cursor
Cherry-picked from upstream commit 5b32406. - Summarizes long chat histories to reduce context window usage - Supports both chat and utility model for compaction - UI button to trigger compaction manually - Configurable via plugin settings Made-with: Cursor
feat(#21): Add chat compaction plugin
Summary
contexts[]/tasks[]on every WebSocket push; use timestamp-based versioning so the list is only included when it actually changed (create/remove/rename)POST /chat_logsendpoint with scroll-position preservationHistoryobject parsing; only metadata is loaded at boot, agents hydrated on first accessLog._notify_state_monitornow usesmark_dirty_for_contextinstead ofmark_dirty_all, so log updates only push to clients viewing that chatrequestAnimationFrameto eliminate UI freezesTest plan
Log.output(tail=N), conditional chat list, lazy deserialization, scoped dirty signals, andChatLogsendpointSpec:
docs/superpowers/specs/2026-03-26-webui-performance-optimization-design.mdMade with Cursor