
fix: unarchive session on touch, stop cache eviction on archive#16031

Closed
BYK wants to merge 18 commits into anomalyco:dev from BYK:fix/archived-session-cache-eviction

Conversation


BYK commented Mar 4, 2026

Issue for this PR

Closes #16030

Type of change

  • Bug fix
  • New feature
  • Refactor / code improvement
  • Documentation

What does this PR do?

Fixes two related bugs that occur when interacting with an archived session:

Problem: When you navigate to an archived session and send a prompt, Session.touch() publishes a session.updated SSE event with time.archived still set. The frontend event-reducer sees this and unconditionally removes the session from the store and wipes all cached data (messages, parts, diffs, todos, permissions, questions, session_status). This causes the "Unable to retrieve session" toast and makes all messages disappear.

Fix (2 parts):

  1. Backend — Session.touch() now clears time_archived (session/index.ts): When a session is touched (during prompting), time_archived is set to null. This means the published SSE event has time.archived = undefined (falsy), so the event-reducer treats it as a normal update — the session gets upserted back into the sidebar list instead of removed. This is the correct behavior: interacting with an archived session should unarchive it.

  2. Frontend — Remove cleanupSessionCaches() from archived branch (event-reducer.ts): Defense in depth. The session.updated handler still removes archived sessions from the sidebar list and decrements sessionTotal, but no longer wipes cached message/part/diff/todo/permission/question/status data. Archive is reversible — the user can navigate back at any time — and caches are harmless to keep. Memory is reclaimed naturally by store trimming. session.deleted still cleans caches since deletion is permanent.
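The two parts of the fix can be sketched as below. This is a minimal illustration, not the code from session/index.ts or event-reducer.ts: the `SessionInfo` and `Store` shapes, the `touch` signature, and the `applySessionUpdated` helper are all hypothetical, and incrementing `sessionTotal` on upsert is an assumption.

```typescript
// Hypothetical shapes; the real types in the repository may differ.
type SessionInfo = {
  id: string
  time: { updated: number; archived?: number }
}

type Store = {
  sessions: SessionInfo[]
  sessionTotal: number
  message: Record<string, string[]> // cached messages keyed by session id
}

// Backend side: touching a session bumps `updated` and drops the archived marker.
function touch(session: SessionInfo, now: number): SessionInfo {
  return { ...session, time: { updated: now } }
}

// Frontend side: an archived update removes the session from the sidebar list
// but leaves caches alone; any other update upserts the session.
function applySessionUpdated(store: Store, info: SessionInfo): void {
  const index = store.sessions.findIndex((s) => s.id === info.id)
  if (info.time.archived) {
    if (index !== -1) {
      store.sessions.splice(index, 1)
      store.sessionTotal = Math.max(0, store.sessionTotal - 1)
    }
    return // no cache cleanup: archiving is reversible
  }
  if (index === -1) {
    store.sessions.push(info)
    store.sessionTotal += 1 // assumption: the total tracks listed sessions
  } else {
    store.sessions[index] = info
  }
}
```

With this shape, a touched session arrives without `time.archived`, so the reducer takes the upsert path and the cached messages are never evicted.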

How did you verify your code works?

  • Backend test: new test in session.test.ts verifies touch() on an archived session clears time_archived and publishes an event with archived: undefined
  • Frontend test: updated event-reducer.test.ts — the archive test now asserts that caches are preserved (session removed from list, sessionTotal decremented, but message/part/diff/todo/permission/question/status all still defined)
  • All existing tests pass (15 tests across both files)

Checklist

  • I have tested my changes locally
  • I have not included unrelated changes in this PR

BYK requested a review from adamdotdevin as a code owner March 4, 2026 17:55
BYK force-pushed the fix/archived-session-cache-eviction branch 2 times, most recently from 1b2c802 to 49237cc March 16, 2026 15:05
BYK force-pushed the fix/archived-session-cache-eviction branch 6 times, most recently from 23b4805 to ef517c5 March 26, 2026 12:26
BYK force-pushed the fix/archived-session-cache-eviction branch 2 times, most recently from 3881055 to d1ba882 April 1, 2026 18:37
BYK added 18 commits April 2, 2026 20:05
meta.limit grows monotonically via loadMore() as the user scrolls up
through message history (200 → 400 → 600 → ...). When switching sessions
and returning, the 15s TTL expires and triggers a force-refresh that
reuses this inflated limit as the fetch size — re-downloading the entire
browsing history from scratch on every session switch.

Cap the limit to messagePageSize (200) on force-refreshes so only the
latest page is fetched. loadMore() still works normally after returning
since loadMessages() resets meta.cursor to match the fresh page.
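A minimal sketch of the capping rule described in this commit message. `fetchLimit` is a hypothetical helper name; only `messagePageSize` and its value of 200 come from the text above.

```typescript
const messagePageSize = 200 // page size named in the commit message

// Hypothetical helper: choose how many messages to request for a session.
// `metaLimit` is the limit that loadMore() has grown while scrolling.
function fetchLimit(metaLimit: number, force: boolean): number {
  // On a force-refresh, ignore the inflated limit and fetch a single page.
  return force ? Math.min(metaLimit, messagePageSize) : metaLimit
}
```

Under this rule a session that was scrolled up to a limit of 600 refetches only 200 messages after the TTL expires, while ordinary loads keep whatever limit is in effect.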
…ded dir walk

Three changes to reduce server event loop contention when many projects
load concurrently (e.g. opening ~/Code with 17 sidebar projects):

1. Instance bootstrap concurrency limit (p-limit, N=3):
   Each Instance bootstrap spawns 5-7 subprocesses (git, ripgrep,
   parcel/watcher). With 17 concurrent bootstraps that's ~100 subprocesses
   overwhelming a 4-core SATA SSD system. The p-limit semaphore gates new
   directory bootstraps while allowing cache-hit requests through instantly.
   Peak subprocesses: ~100 → ~18.

2. Async Filesystem.exists and isDir:
   Filesystem.exists() wrapped existsSync() in an async function — it
   looked async but blocked the event loop on every call. Replaced with
   fs.promises.access(). Same for isDir (statSync → fs.promises.stat).
   This lets the event loop serve health checks and API responses while
   directory walk-ups are in progress. Added Filesystem.existsSync() for
   callers that genuinely need sync behavior.

3. Bounded Filesystem.up/findUp walk depth (maxDepth=10):
   Previously walked all the way to / with no limit. Added maxDepth
   parameter (default 10) as a safety net against degenerate deep paths.
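The concurrency gate from item 1 can be approximated with a small hand-written semaphore. This is a sketch in the spirit of p-limit, not the p-limit package itself, and `createLimiter` is a hypothetical name.

```typescript
// Minimal concurrency limiter: at most `concurrency` tasks run at once,
// the rest wait in FIFO order.
function createLimiter(concurrency: number) {
  let active = 0
  const waiting: Array<() => void> = []

  const release = (): void => {
    active--
    const next = waiting.shift()
    if (next) next()
  }

  return function limit<T>(task: () => Promise<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      const start = (): void => {
        active++
        // Wrapping in a Promise turns a synchronous throw into a rejection,
        // so the slot is always released.
        new Promise<T>((r) => r(task())).then(resolve, reject).then(release)
      }
      if (active < concurrency) start()
      else waiting.push(start)
    })
  }
}
```

With `createLimiter(3)`, queuing 17 bootstraps starts three immediately and begins each remaining one as an earlier one settles. Letting cache hits bypass the limiter, as the commit message describes, would happen in the caller before `limit` is invoked.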
…g in memory

Previously, bash tool and /shell command accumulated ALL stdout/stderr
in an unbounded string — a verbose command could grow to hundreds of MB.

Now output beyond Truncate.MAX_BYTES (50KB) is streamed directly to a
spool file on disk. Only the first 50KB is kept in memory for the
metadata preview. The full output remains recoverable via the spool file
path included in the tool result metadata.
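A rough sketch of the preview-plus-spool idea follows. `createCollector` and its `spool` callback are hypothetical; the sketch assumes the spool receives the full output once the limit is exceeded, which is one reading of the commit message, and it leaves the actual file writing to the callback.

```typescript
const MAX_BYTES = 50 * 1024 // stands in for Truncate.MAX_BYTES

// Keeps the first `maxBytes` in memory as a preview. Once output exceeds that,
// everything (preview included) is handed to `spool`, for example an
// appending file write.
function createCollector(spool: (chunk: Uint8Array) => void, maxBytes: number = MAX_BYTES) {
  const preview: Uint8Array[] = []
  let kept = 0
  let spooling = false
  return {
    write(chunk: Uint8Array): void {
      const room = maxBytes - kept
      if (room > 0) {
        const head = chunk.subarray(0, room)
        preview.push(head)
        kept += head.length
      }
      if (spooling) {
        spool(chunk)
        return
      }
      if (chunk.length > room) {
        // First overflow: flush the preview, then the rest of this chunk.
        spooling = true
        for (const part of preview) spool(part)
        spool(chunk.subarray(Math.max(room, 0)))
      }
    },
    previewBytes: (): number => kept,
    isSpooling: (): boolean => spooling,
  }
}
```

Memory use is bounded by `maxBytes` regardless of how verbose the command is, since chunks past the limit are never retained.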
- LSP: delete diagnostic entries when server publishes empty array
  instead of keeping empty arrays in the map forever
- RPC: add 60s timeout to pending calls so leaked promises don't
  accumulate indefinitely
- FileTime: skip (already handled by Effect InstanceState migration)
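The LSP change amounts to treating an empty publish as a delete. A minimal sketch, with a hypothetical `Diagnostic` shape and handler name:

```typescript
type Diagnostic = { message: string } // simplified, hypothetical shape

const diagnostics = new Map<string, Diagnostic[]>()

// An empty array from the server means the file has no diagnostics left,
// so the entry is removed instead of being stored as [].
function onPublishDiagnostics(uri: string, items: Diagnostic[]): void {
  if (items.length === 0) diagnostics.delete(uri)
  else diagnostics.set(uri, items)
}
```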
Two optimizations to drastically reduce memory during prompting:

1. filterCompactedLazy: probe newest 50 message infos (1 query, no
   parts) to detect compaction. If none found, fall back to original
   single-pass filterCompacted(stream()) — avoids 155+ wasted info-only
   queries for uncompacted sessions. Compacted sessions still use the
   efficient two-pass scan.

2. Context-window windowing: before calling toModelMessages, estimate
   which messages from the tail fit in the LLM context window using
   model.limit.context * 4 chars/token. Only convert those messages to
   ModelMessage format. For a 7,704-message session where ~200 fit in
   context, this reduces toModelMessages input from 7,704 to ~200
   messages — cutting ~300MB of wrapper objects across 4-5 copy layers
   down to ~10MB.

Also caches conversation across prompt loop iterations — full reload
only after compaction, incremental merge for tool-call steps.
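The context-window windowing in item 2 can be illustrated as a tail scan over estimated message sizes. The `Msg` shape and `tailWithinContext` name are hypothetical; only the 4 chars/token estimate comes from the commit message.

```typescript
type Msg = { id: string; chars: number } // hypothetical: size precomputed per message

// Returns the longest suffix of `messages` whose estimated size fits in the
// model context, using a fixed chars-per-token estimate.
function tailWithinContext(messages: Msg[], contextTokens: number, charsPerToken = 4): Msg[] {
  const budget = contextTokens * charsPerToken
  let used = 0
  let start = messages.length
  for (let i = messages.length - 1; i >= 0; i--) {
    const size = messages[i]!.chars
    if (used + size > budget) break
    used += size
    start = i
  }
  return messages.slice(start)
}
```

Only the returned slice would then be converted to the model message format. A real implementation would probably keep at least the newest message even when it alone exceeds the budget; this sketch does not.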
AsyncQueue is unbounded — when SSE clients fall behind (slow network,
stalled browser tab), JSON-serialized bus events accumulate without limit.
This caused 187GB RSS in production (anomalyco#16697).

Add optional capacity parameter to AsyncQueue with drop-oldest behavior.
Set to 1024 items (~2MB) for both SSE endpoints. TUI queues remain
unbounded (1:1 request/response, never accumulate).
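A bounded queue with drop-oldest behavior might look like the sketch below. It is not the project's AsyncQueue; the `dropped` counter and `snapshot` method are additions for illustration.

```typescript
class BoundedAsyncQueue<T> {
  private items: T[] = []
  private waiters: Array<(value: T) => void> = []
  private capacity: number
  dropped = 0

  // Omitting `capacity` keeps the queue unbounded.
  constructor(capacity: number = Infinity) {
    this.capacity = capacity
  }

  push(item: T): void {
    const waiter = this.waiters.shift()
    if (waiter) {
      waiter(item) // a consumer is already waiting, hand the item over
      return
    }
    if (this.items.length >= this.capacity) {
      this.items.shift() // drop the oldest buffered item
      this.dropped++
    }
    this.items.push(item)
  }

  next(): Promise<T> {
    if (this.items.length > 0) return Promise.resolve(this.items.shift() as T)
    return new Promise<T>((resolve) => {
      this.waiters.push(resolve)
    })
  }

  snapshot(): T[] {
    return [...this.items]
  }
}
```

A slow SSE client then loses its oldest pending events instead of growing server memory without limit, which suits event streams where the latest state matters most.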
… minimize

Two additional fixes:

1. Add plan_exit/plan_enter to the tools disable map in task.ts.
   The session permissions were being overwritten by SessionPrompt.prompt()
   which converts the tools map into session permissions. Without plan_exit
   in the tools map, it wasn't being denied.

2. Add minimize/expand toggle to the question dock so users can collapse
   it to read the conversation while a question is pending. Adds a
   chevron button in the header and makes the title clickable to toggle.
   DockPrompt gains a minimized prop that hides content and footer.
…selection

1. Restore Markdown component usage in live question dock — the
   rebase dropped it, leaving plain text rendering while the import
   was still present.

2. Refuse plan_exit when plan file is empty/missing — return an error
   telling the agent to write the plan first instead of showing an
   empty question to the user.

3. Add question-text to the user-select: text allow-list so plan
   content in the question dock is selectable and copyable.
- One-time migration to incremental auto-vacuum (PRAGMA auto_vacuum=2)
  so disk space is reclaimed when sessions are deleted
- Add Database.checkpoint() (TRUNCATE mode) and Database.vacuum()
  (incremental_vacuum(500)) helpers
Snapshot.patch() now stores relative paths (instead of joining with worktree
root) and caps at 1000 entries. Revert handles both old absolute and new
relative paths via path.isAbsolute() for backward compat.

Summary diffs (summarizeSession/summarizeMessage) use a ~1MB byte budget
based on actual before+after content size instead of an arbitrary count cap.
This prevents multi-MB payloads while allowing many small file diffs.

Closes anomalyco#18921
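The byte-budget idea for summary diffs can be sketched as follows. The `FileDiff` shape and `withinBudget` name are hypothetical, and stopping at the first diff that overflows the budget is an assumption; the actual code may skip large entries and keep going.

```typescript
type FileDiff = { file: string; before: string; after: string }

const BUDGET_BYTES = 1024 * 1024 // roughly 1MB, as in the commit message

// Keeps diffs in order until the combined before+after size exceeds the budget.
function withinBudget(diffs: FileDiff[], budget: number = BUDGET_BYTES): FileDiff[] {
  const encoder = new TextEncoder()
  const kept: FileDiff[] = []
  let used = 0
  for (const diff of diffs) {
    const size = encoder.encode(diff.before).length + encoder.encode(diff.after).length
    if (used + size > budget) break
    used += size
    kept.push(diff)
  }
  return kept
}
```

Compared with a fixed count cap, this lets through hundreds of tiny diffs while still stopping a handful of multi-megabyte files from inflating the payload.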
Build on anomalyco#19299 by @thdxr with production hardening for the new
server architecture (anomalyco#19335):

- Add pre-bootstrap static middleware in server.ts before
  WorkspaceRouterMiddleware to avoid Instance.provide() + DB migration
  checks on every CSS/JS/image/font request
- Add SPA fallback routing — page routes (no extension) get index.html,
  asset requests (.js, .woff2) fall through to CDN proxy
- Add OPENCODE_APP_DIR env var for explicit dev override
- Auto-detect packages/app/dist in monorepo dev mode
- Use explicit MIME type map for consistent content-type headers
- CDN proxy fallback for assets not found locally/embedded
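The SPA fallback rule and the explicit MIME map can be illustrated with two small helpers. Both function names and the specific map entries are assumptions for illustration; the actual middleware in server.ts is not shown in this PR text.

```typescript
type RouteKind = "page" | "asset"

// Paths whose last segment has no extension are treated as page routes and
// would get index.html; anything with an extension is an asset request.
function classify(pathname: string): RouteKind {
  const last = pathname.split("/").pop() ?? ""
  return last.includes(".") ? "asset" : "page"
}

// Illustrative subset of an explicit MIME map.
const MIME: Record<string, string> = {
  ".html": "text/html; charset=utf-8",
  ".js": "text/javascript; charset=utf-8",
  ".css": "text/css; charset=utf-8",
  ".svg": "image/svg+xml",
  ".woff2": "font/woff2",
}

function contentType(pathname: string): string {
  const dot = pathname.lastIndexOf(".")
  const ext = dot === -1 ? "" : pathname.slice(dot).toLowerCase()
  return MIME[ext] ?? "application/octet-stream"
}
```

In this sketch an asset that is missing locally would be the case that falls through to the CDN proxy, while a page route never does.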
The plan_exit tool renders the entire plan file as markdown in the
question dock. Two issues prevented scrolling:

1. DockShell used overflow: clip which prevents all descendant scrolling.
   Changed to overflow: hidden (still clips at border-radius but allows
   inner scroll containers).

2. question-content had no overflow — added overflow-y: auto so both
   the question text and options scroll together when content exceeds
   the dock height.
BYK force-pushed the fix/archived-session-cache-eviction branch from d1ba882 to 13a044a April 9, 2026 13:23

BYK commented Apr 9, 2026

Closing — changes have been incorporated into dev.

BYK closed this Apr 9, 2026

Development

Successfully merging this pull request may close these issues.

Archived session loses messages and becomes inaccessible when interacted with
