Persist turn state + restore conversation history on cold-boot #1202

Merged
senamakel merged 11 commits into tinyhumansai:main from senamakel:feat/threads-turn-state-persistence
May 5, 2026

Conversation

@senamakel
Member

@senamakel senamakel commented May 5, 2026

Summary

  • Persist each thread's in-flight agent turn to disk so the UI can rehydrate (or surface "interrupted") after a restart.
  • Fix set_agent_definition_name to also rebuild session_key — previously, persist wrote session_raw/<ts>_orchestrator.jsonl while resume searched for <ts>_orchestrator_thread-XXX.jsonl, so transcript-resume always missed and the LLM ran every cold-boot turn with empty history.
  • Add a JSONL fallback: when the web channel builds a fresh agent, seed it from the authoritative conversation log via Agent::seed_resume_from_messages so existing on-disk threads (with the misnamed transcripts) recover immediately, not just future ones.
  • New JSON-RPC surface (threads_turn_state_get / _list / _clear) and Redux hydration plumbing for the snapshot.

Problem

Cold-booting the desktop app and continuing a conversation behaved as if the thread had no history — the agent reflected the literal current message back instead of answering against the prior turns. Two layered bugs:

  1. The web channel scopes transcripts per thread by calling set_agent_definition_name("orchestrator_thread-XXX") after build. The setter only updated agent_definition_name; session_key (used by persist_session_transcript for the filename) stayed at the builder-time <ts>_orchestrator. find_latest_transcript searches by agent_definition_name, so it could never match the on-disk file. Every cold boot ran with an empty transcript.
  2. Even after fixing the rename, every existing transcript on disk still has the wrong filename, so the next cold boot would still come up empty for an entire release window.
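
To make item 1 concrete, the following is a minimal sketch of how a setter could rebuild the key so that persist and resume agree on one filename. The helper names `sanitize_name` and `rebuild_session_key` and the exact sanitising rule are assumptions for illustration, not the code in this PR; only the `<ts>_<name>` shape and the idea of keeping the timestamp prefix come from this description.

```rust
/// Replace every character outside [A-Za-z0-9_-] with '_'.
/// Hypothetical rule: the real sanitiser is not shown in this PR.
fn sanitize_name(name: &str) -> String {
    name.chars()
        .map(|c| {
            if c.is_ascii_alphanumeric() || c == '_' || c == '-' {
                c
            } else {
                '_'
            }
        })
        .collect()
}

/// Rebuild a `<ts>_<name>` key from `old_key` and a new definition name,
/// keeping the timestamp prefix when the old key has one.
fn rebuild_session_key(old_key: &str, new_name: &str) -> String {
    let name = sanitize_name(new_name);
    match old_key.split_once('_') {
        // Keep the prefix only when it looks like a unix timestamp.
        Some((ts, _)) if !ts.is_empty() && ts.chars().all(|c| c.is_ascii_digit()) => {
            format!("{}_{}", ts, name)
        }
        // No recognisable timestamp: fall back to the bare name.
        _ => name,
    }
}
```

With a key built this way, a transcript persisted as `session_raw/<ts>_orchestrator_thread-XXX.jsonl` is the same file a name-based search would look for.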

In parallel, no per-thread runtime state (in-flight tool timeline, streaming buffer, lifecycle) survived a process restart, so navigating away mid-turn always lost UI state.

Solution

Turn-state persistence

  • New openhuman::threads::turn_state module: TurnState types, atomic JSON TurnStateStore at <workspace>/memory/conversations/turn_states/<hex(thread_id)>.json, and a TurnStateMirror that translates AgentProgress → snapshot mutations. High-frequency text/thinking/args deltas mutate memory only; flushes happen on iteration / tool / subagent boundaries to avoid filesystem thrash. Terminal TurnCompleted deletes the file; bridge exit without completion stamps Interrupted.
  • Web-channel progress bridge wires the mirror alongside its existing socket emissions.
  • bootstrap_skill_runtime calls mark_all_interrupted once per process so stale snapshots from a previous process surface consistently.
  • Three new RPC methods on the threads namespace: threads_turn_state_get, _list, _clear.
  • App: chatRuntimeSlice gains a hydrateRuntimeFromSnapshot reducer + fetchAndHydrateTurnState thunk. InferenceTurnLifecycle extended with 'interrupted'. Conversations.tsx dispatches the thunk on every thread switch.
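
The flush policy in the first bullet can be sketched roughly as follows. This is an illustration under stated assumptions, not the real `TurnStateMirror`: the `Progress`, `Snapshot`, `Action`, and `Mirror` types are simplified stand-ins invented here, and the real `AgentProgress` enum has more variants than shown.

```rust
/// Simplified stand-in for the agent progress events (hypothetical subset).
#[derive(Debug, Clone)]
enum Progress {
    TextDelta(String),
    IterationStarted(u32),
    ToolCallCompleted { call_id: String },
    TurnCompleted,
}

/// In-memory snapshot of the turn (hypothetical fields).
#[derive(Debug, Default, Clone, PartialEq)]
struct Snapshot {
    iteration: u32,
    streaming_text: String,
    finished_tools: Vec<String>,
}

/// What the caller should do with the on-disk store after an event.
#[derive(Debug, PartialEq)]
enum Action {
    Keep,   // memory-only update, no disk write
    Flush,  // boundary reached, persist the snapshot
    Delete, // turn finished, remove the snapshot file
}

#[derive(Default)]
struct Mirror {
    state: Snapshot,
}

impl Mirror {
    fn observe(&mut self, event: &Progress) -> Action {
        match event {
            // High-frequency deltas only touch memory.
            Progress::TextDelta(chunk) => {
                self.state.streaming_text.push_str(chunk);
                Action::Keep
            }
            // Iteration and tool boundaries are the flush points.
            Progress::IterationStarted(n) => {
                self.state.iteration = *n;
                Action::Flush
            }
            Progress::ToolCallCompleted { call_id } => {
                self.state.finished_tools.push(call_id.clone());
                Action::Flush
            }
            // A completed turn leaves nothing to rehydrate.
            Progress::TurnCompleted => Action::Delete,
        }
    }
}
```

Returning an action instead of writing inside `observe` keeps the sketch free of I/O; the caller decides when to touch the store.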

Cold-boot history restoration

  • set_agent_definition_name now rebuilds session_key's suffix from the new sanitised name (preserving the unix-timestamp prefix). Future persists land at the same path resume searches.
  • Agent::seed_resume_from_messages(messages, current_user_message) builds a system prompt and primes cached_transcript_messages with the prior conversation. web::run_chat_request calls it on every cache-miss build, using conversations::get_messages as the authoritative source. A trailing user message that matches the incoming message is deduped to avoid double-sending.
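
The dedup step can be illustrated with a small sketch. It assumes history arrives as `(role, content)` pairs and that "matches" means exact string equality; the function name `dedup_trailing_user` is hypothetical, and the real `seed_resume_from_messages` also builds the system prompt, which is omitted here.

```rust
/// Drop the last history entry when it is a user message identical to the
/// message about to be sent, so the model does not see it twice.
/// Assumption: exact equality; the real matching rule may differ.
fn dedup_trailing_user(
    mut history: Vec<(String, String)>,
    current_user_message: &str,
) -> Vec<(String, String)> {
    let duplicate = matches!(
        history.last(),
        Some((role, content)) if role == "user" && content == current_user_message
    );
    if duplicate {
        history.pop();
    }
    history
}
```

Only the trailing entry is inspected, so an earlier user message with the same text stays in the history.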

Submission Checklist

  • Tests added or updated (happy path + at least one failure / edge case) per docs/TESTING-STRATEGY.md
  • Diff coverage ≥ 80% — Rust unit tests for the store (7), mirror (7), and session_key/seed_resume regressions (5); JSON-RPC E2E lifecycle test (json_rpc_thread_turn_state_lifecycle); Vitest reducer tests for streaming + interrupted hydration. Full Vitest suite (1347) and cargo test for the affected modules pass locally.
  • N/A: behaviour-only change — no new feature rows to add to docs/TEST-COVERAGE-MATRIX.md; the touched threads / agent surfaces are already represented.
  • N/A: no matrix-row additions, so no feature IDs to list under ## Related.
  • No new external network dependencies introduced (mock backend used per docs/TESTING-STRATEGY.md)
  • N/A: not a release-cut surface — turn-state snapshot file + transcript path scoping are internal to the threads domain, no docs/RELEASE-MANUAL-SMOKE.md entries.
  • Linked issue tracked via Related — see Consolidate thread message storage onto session_raw transcripts #1201 (this PR is the workaround that issue retires).

Impact

  • Runtime: desktop only. New writes under <workspace>/memory/conversations/turn_states/. Snapshot writes are atomic (tempfile::persist) and bounded to iteration / tool boundaries; no per-delta thrash.
  • Migration: existing thread message JSONLs are left untouched. Existing misnamed transcripts under session_raw/ are simply ignored — the JSONL fallback recovers history from memory/conversations/threads/.
  • Performance: one extra disk read on cache-miss (conversation JSONL parse + system-prompt build), bounded by typical thread length. Hot path (cache-hit) is untouched.
  • Security: snapshot files live alongside existing conversation logs and inherit the same workspace permissions; no new attack surface.
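
As a rough illustration of the write path described under Runtime, the sketch below hex-encodes the thread id into a file name and replaces the snapshot atomically. The PR uses `tempfile::persist`; this sketch substitutes a plain temp file plus `std::fs::rename` so it needs only the standard library, and the names `snapshot_path` and `put_atomic` are assumptions.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Hex-encode the thread id so any id maps to a safe file name.
fn snapshot_path(dir: &Path, thread_id: &str) -> PathBuf {
    let hex: String = thread_id.bytes().map(|b| format!("{:02x}", b)).collect();
    dir.join(format!("{}.json", hex))
}

/// Write `json` to a sibling temp file, fsync it, then rename it over the
/// final path so readers never observe a half-written snapshot.
fn put_atomic(dir: &Path, thread_id: &str, json: &str) -> io::Result<PathBuf> {
    fs::create_dir_all(dir)?;
    let path = snapshot_path(dir, thread_id);
    // "<hex>.json" becomes "<hex>.json.tmp" in the same directory.
    let tmp = path.with_extension("json.tmp");
    let mut file = File::create(&tmp)?;
    file.write_all(json.as_bytes())?;
    file.sync_all()?;
    drop(file);
    // Rename within one directory replaces the old snapshot in a single step.
    fs::rename(&tmp, &path)?;
    Ok(path)
}
```

Keeping the temp file in the same directory as the target matters: a rename across filesystems is not atomic and can fail outright.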

Related

Summary by CodeRabbit

  • New Features

    • Persisted per-thread turn snapshots with get/list/clear controls
    • UI fetches and hydrates per-thread runtime when opening a conversation
    • Fresh sessions can be seeded from conversation logs for improved resume
  • Behavior

    • Startup marks stale snapshots as interrupted to enable retry paths
    • Session renaming is sanitized to avoid resume/persistence mismatches
    • Thread delete/purge now removes associated persisted turn snapshots
  • Tests

    • Added unit and end-to-end tests for persistence, mirroring, recovery, and hydration

senamakel added 3 commits May 4, 2026 20:50
The Rust core now snapshots each thread's in-flight agent turn to
`<workspace>/memory/conversations/turn_states/<hex(thread_id)>.json`
at iteration / tool / subagent boundaries. Snapshots survive process
restarts: on bootstrap any leftover snapshot is stamped Interrupted
so the UI can offer a retry rather than show an empty composer.

- New `openhuman::threads::turn_state` module: types, atomic JSON
  store, and `TurnStateMirror` that translates `AgentProgress` events
  into snapshot mutations. High-frequency text/thinking/args deltas
  mutate memory only; flushes happen on iteration / tool / subagent
  boundaries to avoid filesystem thrash. Terminal `TurnCompleted`
  deletes the file; bridge exit without completion marks Interrupted.
- Web-channel progress bridge wires the mirror alongside its existing
  socket emissions; nothing on the live path changes shape.
- Three new RPC methods on the threads namespace:
  `threads_turn_state_get`, `_list`, `_clear`.
- `bootstrap_skill_runtime` invokes `mark_all_interrupted` once per
  process so stale snapshots are surfaced consistently.
- App: `chatRuntimeSlice` gains `hydrateRuntimeFromSnapshot` reducer
  and `fetchAndHydrateTurnState` thunk; `InferenceTurnLifecycle`
  extended with `'interrupted'`. Conversations dispatches the thunk
  on every thread switch.

Tests: 7 store + 7 mirror Rust unit tests, json_rpc_e2e lifecycle
test (empty → seed → get → list → clear → null), 2 Vitest reducer
tests for streaming + interrupted hydration.

Two bugs caused the agent to forget the entire thread on every app
restart:

1. `set_agent_definition_name` (called by the web channel to scope
   transcripts per thread, e.g. `orchestrator_thread-6ad6d`) updated
   only `agent_definition_name`. `session_key`, which the persist
   path uses to build the transcript filename, stayed at the
   builder-time `<ts>_orchestrator`. So persist wrote
   `session_raw/<ts>_orchestrator.jsonl` while
   `find_latest_transcript` searched for the thread-scoped name and
   never matched. Every cold boot ran with an empty transcript.

   Fix: have `set_agent_definition_name` rebuild `session_key`'s
   suffix from the new sanitised name (preserving the timestamp
   prefix). Two regression tests pin the contract.

2. Even with the rename fix, existing on-disk transcripts are still
   misnamed and `find_latest_transcript` will keep missing them on
   the next boot. Conversation messages, however, are persisted by
   the authoritative JSONL log under `memory/conversations/threads/`
   regardless. Add `Agent::seed_resume_from_messages` and call it
   from `web::run_chat_request` whenever a fresh agent is built —
   the prior thread messages are loaded, the trailing user message
   (the one about to be passed to `run_single`) is deduped, and the
   list is primed into `cached_transcript_messages` with a freshly
   built system prompt. Three tests cover the dedup, the warm-agent
   noop, and the unmatched-trailing-user case.

Net effect: opening a thread after restart and asking a follow-up
question now gets answered against the full conversation, not only
the latest message.
@senamakel senamakel requested a review from a team May 5, 2026 04:17
@coderabbitai
Contributor

coderabbitai Bot commented May 5, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Adds persistent per-thread turn snapshots: Rust records and persists AgentProgress into TurnState files, exposes RPCs to get/list/clear snapshots, marks stale snapshots Interrupted at startup, mirrors turn-state during web-channel streaming, and the frontend fetches and hydrates persisted turn state into Redux when a thread is selected.

Changes

Turn-state persistence → RPC → frontend hydration

| Layer | File(s) | Summary |
|---|---|---|
| Types / Wire Shapes (backend) | `src/openhuman/threads/turn_state/types.rs` | Adds serde wire/storage types: TurnLifecycle (Started/Streaming/Interrupted), TurnPhase, ToolTimelineStatus, ToolTimelineEntry, SubagentActivity, SubagentToolCall, TurnState snapshot model, and RPC envelopes. |
| Store Implementation (backend) | `src/openhuman/threads/turn_state/store.rs`, `src/openhuman/threads/turn_state/store_tests.rs` | Filesystem-backed JSON per-thread snapshot store; atomic overwrite via temp file; global mutex; put/get/delete/list/mark_all_interrupted APIs and tests. |
| Mirror / Observer (backend) | `src/openhuman/threads/turn_state/mirror.rs`, `src/openhuman/threads/turn_state/mirror_tests.rs` | TurnStateMirror translates AgentProgress into TurnState, flushes at iteration/tool boundaries, deletes on TurnCompleted, finish() marks Interrupted; tests exercise lifecycle, args buffering, text deltas, subagents. |
| Module wiring (backend) | `src/openhuman/threads/turn_state/mod.rs`, `src/openhuman/threads/mod.rs` | New turn_state submodule exposing mirror, store, types; re-exports TurnStateMirror, TurnStateStore, and types. |
| RPC ops / controller (backend) | `src/openhuman/threads/ops.rs`, `src/openhuman/threads/schemas.rs`, `src/openhuman/threads/schemas_tests.rs` | Adds ops turn_state_get/list/clear, controller schemas/handlers, includes snapshot deletion during thread delete/purge; tests updated to include new controller functions. |
| Startup recovery (backend) | `src/core/jsonrpc.rs` | bootstrap_skill_runtime calls mark_all_interrupted at startup to stamp non-completed snapshots as Interrupted with current timestamp. |
| Web-channel integration (backend) | `src/openhuman/channels/providers/web.rs` | Session rebuild returns (agent, was_built_fresh); fresh builds seed resume from conversation JSONL via new seed_resume_from_messages; progress bridge now constructs TurnStateMirror with TurnStateStore and mirrors AgentProgress events, deleting or marking Interrupted on exit. |
| Session helpers & tests (backend) | `src/openhuman/agent/harness/session/runtime.rs`, `src/openhuman/agent/harness/session/tests.rs` | set_agent_definition_name sanitizes and rebuilds session_key suffix; seed_resume_from_messages primes cached transcript on cold agents; tests added. |
| End-to-end RPC test (integration) | `tests/json_rpc_e2e.rs` | New Tokio JSON-RPC e2e test exercising turn-state list→get→clear lifecycle. |
| Types / API client (frontend) | `app/src/types/turnState.ts`, `app/src/services/api/threadApi.ts` | Adds TypeScript persisted turn-state types and threadApi.getTurnState, listTurnStates, clearTurnState that call RPC endpoints and normalize envelope results. |
| Redux slice & thunk (frontend) | `app/src/store/chatRuntimeSlice.ts`, `app/src/store/__tests__/chatRuntimeSlice.test.ts` | Extends InferenceTurnLifecycle with 'interrupted'; adds conversion helpers from persisted snapshots; new hydrateRuntimeFromSnapshot reducer; fetchAndHydrateTurnState(threadId) thunk fetches snapshot and dispatches hydration; tests cover streaming and interrupted snapshot hydration. |
| UI wiring (frontend) | `app/src/pages/Conversations.tsx` | On selectedThreadId change, dispatches loadThreadMessages(selectedThreadId) then fetchAndHydrateTurnState(selectedThreadId) to restore per-thread runtime state. |
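
The startup-recovery row can be pictured with a small in-memory sketch. The real `mark_all_interrupted` operates on snapshot files; the `Lifecycle` and `TurnSnapshot` types below are simplified assumptions that keep only the fields needed to show the stamping rule.

```rust
/// Lifecycle values named in the types row above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Lifecycle {
    Started,
    Streaming,
    Interrupted,
}

/// Hypothetical, trimmed-down snapshot record.
#[derive(Debug, Clone, PartialEq)]
struct TurnSnapshot {
    thread_id: String,
    lifecycle: Lifecycle,
    updated_at: u64,
}

/// Stamp every snapshot that is not already Interrupted with the current
/// timestamp and return how many were changed.
fn mark_all_interrupted(snapshots: &mut [TurnSnapshot], now: u64) -> usize {
    let mut changed = 0;
    for snap in snapshots
        .iter_mut()
        .filter(|s| s.lifecycle != Lifecycle::Interrupted)
    {
        snap.lifecycle = Lifecycle::Interrupted;
        snap.updated_at = now;
        changed += 1;
    }
    changed
}
```

Skipping snapshots that are already Interrupted makes the call idempotent, so running it once per process cannot overwrite an earlier interruption time.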

Sequence Diagram(s)

sequenceDiagram
    participant UI as UI (Conversations)
    participant Redux as Redux (chatRuntimeSlice)
    participant API as Frontend API (threadApi)
    participant RPC as RPC Server (threads/ops)
    participant Store as TurnStateStore (Filesystem)

    UI->>Redux: dispatch fetchAndHydrateTurnState(threadId)
    Redux->>API: getTurnState(threadId)
    API->>RPC: threads_turn_state_get
    RPC->>Store: get(threadId)
    Store-->>RPC: TurnState | null
    RPC-->>API: GetTurnStateResponse
    API-->>Redux: PersistedTurnState | null
    Redux->>Redux: dispatch hydrateRuntimeFromSnapshot(snapshot)
    Redux->>UI: Update runtime slices (lifecycle/status/streaming/toolTimeline)
sequenceDiagram
    participant Web as Web Channel (run_chat_task)
    participant Conv as Conversation JSONL (Memory)
    participant Agent as Agent Session
    participant Bridge as Progress Bridge (TurnStateMirror)
    participant Store as TurnStateStore (Filesystem)

    Web->>Agent: build_session_agent() or reuse from cache
    alt fresh agent built
        Web->>Conv: get_messages(thread_id)
        Conv-->>Web: [(role, content), ...]
        Web->>Agent: seed_resume_from_messages(pairs, current_msg)
        Agent->>Agent: populate cached_transcript_messages
    end
    Web->>Bridge: spawn_progress_bridge(TurnStateStore,...)
    Bridge->>Store: TurnStateMirror::new() flush started snapshot
    loop agent progress
        Agent->>Bridge: AgentProgress event
        Bridge->>Bridge: observe(event) update TurnState
        alt flush boundary (iteration/tool events)
            Bridge->>Store: flush() persist snapshot
        end
    end
    alt TurnCompleted
        Bridge->>Store: delete snapshot
    else early exit
        Bridge->>Store: finish() mark Interrupted + flush
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

Possibly related PRs

Poem

🐰 I stitched each turn to disk tonight,

Deltas tucked safe in JSON light,
Interrupted flags for sleepy ends,
Hydrate and hop—resume, dear friends,
Threads awake, the story mends.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Persist turn state + restore conversation history on cold-boot' accurately and specifically describes the main changes in the PR. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@app/src/store/chatRuntimeSlice.ts`:
- Around line 260-282: The code currently writes snapshot.lifecycle into
inferenceTurnLifecycleByThread but still hydrates inferenceStatusByThread and
streamingAssistantByThread from snapshot even when the lifecycle is
'interrupted'; change the logic in the reducer handling snapshot to first check
if snapshot.lifecycle === 'interrupted' (or equivalent interrupted value) and,
when true, skip populating inferenceStatusByThread and
streamingAssistantByThread (instead delete/leave them absent) so that no
live-progress UI is restored for interrupted turns; update the branches around
inferenceTurnLifecycleByThread, inferenceStatusByThread, and
streamingAssistantByThread to apply this lifecycle guard before using
snapshot.iteration, snapshot.streamingText, or snapshot.thinking.

In `@src/openhuman/channels/providers/web.rs`:
- Around line 595-600: TurnStateMirror::observe() returns a boolean indicating a
flush-boundary snapshot must be persisted, but the code currently drops that
value at the call site (turn_state.observe(&event)); update the loop to capture
the return and, when true, immediately persist the snapshot using the same
persistence mechanism used by finish() (i.e., write the snapshot into the
turn_state_store) so boundary updates are durable before TurnCompleted/finish()
runs; reference TurnStateMirror::observe(), the turn_state.observe(&event) call,
and the existing finish()/TurnCompleted persistence path to implement the
immediate flush.

In `@src/openhuman/threads/ops.rs`:
- Around line 462-499: thread lifecycle cleanup currently doesn't remove
persisted turn-state snapshots; update the thread deletion and purge code paths
(e.g., the functions thread_delete and threads_purge in this module) to also
remove turn-state files from the workspace. After acquiring the same
workspace_dir() used by turn_state_get/list/clear, call the turn_state store
APIs (e.g., turn_state::store::delete(dir, &thread_id) for single-thread
deletion and iterate/delete all relevant entries when purging) and
handle/propagate any errors consistently with existing delete/purge behavior so
no stale turn_state remains.

In `@src/openhuman/threads/turn_state/mirror.rs`:
- Around line 79-100: AgentProgress::ToolCallStarted currently always appends a
new ToolTimelineEntry even if a placeholder row with the same call_id was
created earlier (e.g. by ToolCallArgsDelta), causing duplicate entries; modify
the ToolCallStarted handling to search self.state.tool_timeline for an existing
entry with id == call_id and reuse it by updating its fields (name, round,
status -> ToolTimelineStatus::Running, active_tool,
display_name/detail/subagent/source_tool_name as needed) instead of pushing a
new entry, and only push a new ToolTimelineEntry if no existing entry is found;
apply the same reuse logic to the analogous handler referenced around lines
259-289.

In `@src/openhuman/threads/turn_state/store.rs`:
- Around line 39-63: The put() method currently fsyncs the tempfile but not the
parent directory after tmp.persist(&path), so add an explicit fsync of the
snapshot directory to ensure the directory entry (rename) is durable: after
tmp.persist(&path) (in the put function in store.rs, which uses ensure_dir(),
snapshot_path(), NamedTempFile and TURN_STATE_LOCK), open the directory (e.g.,
std::fs::File::open(&dir) or use Dir::open) and call sync_all() and map any
error into the same Result<String, _> error path (similar formatting as the
other map_err messages) so a failure to sync the directory returns an Err.
Ensure the directory fsync occurs only after persist() succeeds.
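
For the last comment, a minimal sketch of a directory fsync after a rename is shown below. It uses only `std`, is Unix-only in effect (opening a directory as a file is not portable), and the function name `persist_durably` is hypothetical. Whether a failed directory sync should be fatal is a policy choice; this version propagates it.

```rust
use std::fs::{self, File};
use std::io;
use std::path::Path;

/// Rename `tmp` over `dest`, then fsync the parent directory so the new
/// directory entry itself survives a crash or power loss.
fn persist_durably(tmp: &Path, dest: &Path) -> io::Result<()> {
    fs::rename(tmp, dest)?;
    // Only attempted on Unix; on other platforms this block is compiled out.
    #[cfg(unix)]
    {
        let parent = dest.parent().filter(|p| !p.as_os_str().is_empty());
        if let Some(parent) = parent {
            // Syncing the directory handle flushes the rename to disk.
            File::open(parent)?.sync_all()?;
        }
    }
    Ok(())
}
```

The directory sync runs only after the rename succeeds, which matches the ordering the comment asks for.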

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 352827a8-bd65-42ee-a25a-3493d773a4ee

📥 Commits

Reviewing files that changed from the base of the PR and between 62516ba and 255f307.

⛔ Files ignored due to path filters (2)
  • Cargo.lock is excluded by !**/*.lock
  • app/src-tauri/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (19)
  • app/src/pages/Conversations.tsx
  • app/src/services/api/threadApi.ts
  • app/src/store/__tests__/chatRuntimeSlice.test.ts
  • app/src/store/chatRuntimeSlice.ts
  • app/src/types/turnState.ts
  • src/core/jsonrpc.rs
  • src/openhuman/agent/harness/session/runtime.rs
  • src/openhuman/agent/harness/session/tests.rs
  • src/openhuman/channels/providers/web.rs
  • src/openhuman/threads/mod.rs
  • src/openhuman/threads/ops.rs
  • src/openhuman/threads/schemas.rs
  • src/openhuman/threads/turn_state/mirror.rs
  • src/openhuman/threads/turn_state/mirror_tests.rs
  • src/openhuman/threads/turn_state/mod.rs
  • src/openhuman/threads/turn_state/store.rs
  • src/openhuman/threads/turn_state/store_tests.rs
  • src/openhuman/threads/turn_state/types.rs
  • tests/json_rpc_e2e.rs

senamakel added 3 commits May 4, 2026 21:23
- Hydrate path: when lifecycle === 'interrupted', skip
  inferenceStatusByThread / streamingAssistantByThread so the UI
  doesn't render fake live progress from a stale snapshot. Tool
  timeline IS preserved as a frozen record next to the retry banner.
- Mirror: when ToolCallStarted lands and an args-delta placeholder
  with the same call_id already exists, mutate the placeholder
  in-place instead of appending a duplicate row.
- Store: fsync the parent directory after persist() so the rename
  survives a crash / power loss (Unix only — Windows relies on NTFS
  journaling). Best-effort: fsync failure is logged, not fatal.
- ops::thread_delete + threads_purge: drop turn-state snapshots so
  turn_state_list never surfaces orphaned files for deleted threads.

Adds two regression tests (placeholder-merge in mirror, stale-field
suppression in chatRuntimeSlice).
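
The placeholder merge in the mirror can be sketched as follows. The `Timeline`, `TimelineEntry`, and `Status` types and both method names are simplified assumptions; the real `ToolTimelineEntry` carries more fields (round, display name, subagent, and so on).

```rust
#[derive(Debug, Clone, PartialEq)]
enum Status {
    Pending, // placeholder created by an args delta
    Running, // start event has been seen
}

/// Hypothetical, trimmed-down timeline row.
#[derive(Debug, Clone, PartialEq)]
struct TimelineEntry {
    id: String,
    name: String,
    status: Status,
    args_buffer: String,
}

#[derive(Default)]
struct Timeline {
    entries: Vec<TimelineEntry>,
}

impl Timeline {
    /// Args deltas may arrive before the start event; create a placeholder
    /// row the first time a call_id is seen.
    fn on_args_delta(&mut self, call_id: &str, chunk: &str) {
        match self.entries.iter().position(|e| e.id == call_id) {
            Some(i) => self.entries[i].args_buffer.push_str(chunk),
            None => self.entries.push(TimelineEntry {
                id: call_id.to_string(),
                name: String::new(),
                status: Status::Pending,
                args_buffer: chunk.to_string(),
            }),
        }
    }

    /// Reuse the row with the same call_id instead of appending a duplicate.
    fn on_tool_call_started(&mut self, call_id: &str, name: &str) {
        match self.entries.iter().position(|e| e.id == call_id) {
            Some(i) => {
                self.entries[i].name = name.to_string();
                self.entries[i].status = Status::Running;
            }
            None => self.entries.push(TimelineEntry {
                id: call_id.to_string(),
                name: name.to_string(),
                status: Status::Running,
                args_buffer: String::new(),
            }),
        }
    }
}
```

Mutating in place also preserves whatever args were buffered before the start event arrived.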
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/threads/ops.rs`:
- Around line 441-446: The current code calls turn_state::store::delete(dir,
&request.thread_id) and only logs a warning on Err, which allows thread_delete /
threads_purge to return success while persisted turn snapshots remain; change
the flow to propagate the deletion failure instead of silencing it: in the
thread_delete and threads_purge handlers detect Err from
turn_state::store::delete and either return an RPC error (propagate the Err) or
include a partial-failure field in the RPC response indicating which thread_ids
failed to delete (using request.thread_id to identify them); update any call
sites that expect unconditional success accordingly so the RPC result reflects
the cleanup failure rather than only logging it.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: bf1a0520-18ac-4ba3-b8e7-ed34010bf7d4

📥 Commits

Reviewing files that changed from the base of the PR and between 256ecf4 and 68d40d5.

📒 Files selected for processing (6)
  • app/src/store/__tests__/chatRuntimeSlice.test.ts
  • app/src/store/chatRuntimeSlice.ts
  • src/openhuman/threads/ops.rs
  • src/openhuman/threads/turn_state/mirror.rs
  • src/openhuman/threads/turn_state/mirror_tests.rs
  • src/openhuman/threads/turn_state/store.rs
✅ Files skipped from review due to trivial changes (3)
  • app/src/store/__tests__/chatRuntimeSlice.test.ts
  • src/openhuman/threads/turn_state/mirror_tests.rs
  • src/openhuman/threads/turn_state/mirror.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/openhuman/threads/turn_state/store.rs

senamakel added 2 commits May 4, 2026 21:34
CodeRabbit pointed out that thread_delete / threads_purge previously
treated turn-state cleanup as best-effort and only logged on failure.
That meant the RPC could return success while the snapshot files —
which mirror conversation-derived state — remained on disk, weakening
the data-deletion guarantee these endpoints are supposed to provide.

Propagate the error instead. The thread index row is already gone by
the time cleanup runs, so an error message communicates partial
failure rather than total failure; callers see exactly which step
left state behind and can retry the cleanup explicitly.
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/threads/ops.rs`:
- Around line 445-451: turn_state::store::delete may return early and prevent
web_channel::invalidate_thread_sessions from running, leaving active sessions
for a deleted thread; ensure
invalidate_thread_sessions(&request.thread_id).await always runs even when
delete fails by calling web_channel::invalidate_thread_sessions before returning
the error (or by capturing the delete result, awaiting
invalidate_thread_sessions, then propagating the formatted error). Refer to
turn_state::store::delete, web_channel::invalidate_thread_sessions, and
request.thread_id to locate and implement this reorder/try-finally style flow.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: cf9473ec-adb8-4ff6-b1f7-42e1364aef2c

📥 Commits

Reviewing files that changed from the base of the PR and between 68d40d5 and f33cf10.

📒 Files selected for processing (1)
  • src/openhuman/threads/ops.rs

Reorders the two cleanup steps in thread_delete so
web_channel::invalidate_thread_sessions runs even when
turn_state::store::delete fails. Otherwise a flaky snapshot delete
could leave an active in-process session pinned to a thread that no
longer exists in the index — caught by CodeRabbit.
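
One way to guarantee that both steps run is to capture the delete result, invalidate, and only then propagate the error. The sketch below shows that shape with synchronous closures standing in for `turn_state::store::delete` and the async `web_channel::invalidate_thread_sessions`; the function name `cleanup_thread` and the error wording are assumptions, and the actual commit may simply call the invalidation first.

```rust
/// Run both cleanup steps; a failed snapshot delete must never skip the
/// session invalidation. The delete error is reported afterwards.
fn cleanup_thread<D, I>(
    thread_id: &str,
    delete_snapshot: D,
    invalidate_sessions: I,
) -> Result<(), String>
where
    D: FnOnce(&str) -> Result<(), String>,
    I: FnOnce(&str),
{
    // Capture the result instead of returning early with `?`.
    let delete_result = delete_snapshot(thread_id);
    // Always runs, whether or not the delete succeeded.
    invalidate_sessions(thread_id);
    delete_result.map_err(|e| {
        format!(
            "thread {} removed, but turn-state cleanup failed: {}",
            thread_id, e
        )
    })
}
```

The caller still sees the partial failure, but no in-process session stays pinned to the deleted thread.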
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/threads/ops.rs`:
- Around line 475-484: threads_purge currently calls turn_state::store::list
which skips unreadable/corrupted snapshot files, so purging can falsely succeed;
add and call a deserialize-independent API on the store (e.g.,
turn_state::store::clear_all or TurnStateStore::clear_all) from threads_purge to
remove all snapshot artifacts regardless of readability, and update
threads_purge to use that new API instead of iterating list/delete so corrupted
files are also deleted.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 186b4805-17b1-4dc9-bab6-0d518b6df15b

📥 Commits

Reviewing files that changed from the base of the PR and between f33cf10 and 179abfc.

📒 Files selected for processing (1)
  • src/openhuman/threads/ops.rs

senamakel added 2 commits May 4, 2026 21:47
…endent clear_all

Previously threads_purge iterated turn_state::store::list, which
warn-and-skips files that fail to deserialize. A purge could
silently leave behind corrupted/half-written snapshot files,
defeating the destructive-cleanup guarantee.

Adds TurnStateStore::clear_all that walks the directory and removes
every *.json entry without parsing — corrupted or schema-skewed
files are wiped along with the rest. Regression test seeds an
intentionally corrupt snapshot alongside good ones and asserts all
three are cleared. CodeRabbit.
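
A deserialize-independent clear can be sketched with the standard library alone. This is an illustration of the approach described in the commit message, not the actual `TurnStateStore::clear_all`; the free-function form, the returned count, and the treatment of a missing directory as "nothing to clear" are assumptions.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Remove every `*.json` file in `dir` without parsing it, so corrupted
/// snapshots are deleted too. Returns the number of files removed.
fn clear_all(dir: &Path) -> io::Result<usize> {
    let entries = match fs::read_dir(dir) {
        Ok(entries) => entries,
        // Assumption: a missing directory means there is nothing to clear.
        Err(e) if e.kind() == io::ErrorKind::NotFound => return Ok(0),
        Err(e) => return Err(e),
    };
    let mut removed = 0;
    for entry in entries {
        let path = entry?.path();
        let is_json = path.extension().and_then(|ext| ext.to_str()) == Some("json");
        if is_json && path.is_file() {
            fs::remove_file(&path)?;
            removed += 1;
        }
    }
    Ok(removed)
}
```

Because the file contents are never read, schema skew or a half-written snapshot cannot cause a file to be skipped.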
@senamakel senamakel merged commit 49a9b83 into tinyhumansai:main May 5, 2026
16 of 17 checks passed
jwalin-shah added a commit to jwalin-shah/openhuman that referenced this pull request May 5, 2026
* feat(remotion): Ghosty character library with transparent MOV variants (tinyhumansai#1059)

Co-authored-by: WOZCODE <contact@withwoz.com>

* feat(composio/gmail): sync into memory tree (Slack-parity) (tinyhumansai#1056)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(scheduler-gate): throttle background AI on battery / busy CPU (tinyhumansai#1062)

* fix(core,cef): run core in-process and stop orphaning CEF helpers on Cmd+Q (tinyhumansai#1061)

* ci: add dedicated staging release workflow (tinyhumansai#1066)

* fix(sentry): Rust source context + per-release deploy marker (tinyhumansai#405) (tinyhumansai#1067)

* fix(welcome): re-enable OAuth buttons with focus/timeout recovery (tinyhumansai#1049) (tinyhumansai#1069)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(dependencies): update pnpm-lock.yaml and Cargo.lock for package… (tinyhumansai#1082)

* fix(onboarding): personalize welcome agent greeting with user identity (tinyhumansai#1078)

* fix(chat): make agent message bubbles fit content width (tinyhumansai#1083)

* Feat/dmg checks (tinyhumansai#1084)

* fix(linux): Add X11 platform flags to .deb package launcher (tinyhumansai#1087)

Co-authored-by: unn-Known1 <unn-known1@users.noreply.github.com>

* fix(sentry): auto-send React events; collapse core→tauri for desktop (tinyhumansai#1086)

Co-authored-by: Steven Enamakel <enamakel@tinyhumans.ai>

* fix(cef): run blank reload guard on the CEF UI thread (tinyhumansai#1092)

* fix(app): reload webview instead of restart_app in dev mode (tinyhumansai#1068) (tinyhumansai#1071)

* fix(linux): deliver X11 ozone flags via custom .desktop template (tinyhumansai#1091)

* fix(webview-accounts): retry data-dir purge so CEF handle race doesn't leak cookies (tinyhumansai#1076) (tinyhumansai#1081)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Steven Enamakel <enamakel@tinyhumans.ai>

* fix(webview/slack): media perms + deep-link isolation (tinyhumansai#1074) (tinyhumansai#1080)

Co-authored-by: Steven Enamakel <enamakel@tinyhumans.ai>

* ci(release): split staging vs production workflows; promote staging tags (tinyhumansai#1094)

* Update release-staging.yml (tinyhumansai#1097)

* chore(staging): v0.53.5

* chore(staging): v0.53.6

* ci(staging): cut staging from main; add act local-debug helper (tinyhumansai#1099)

* chore(staging): v0.53.7

* fix(ci): correct sentry-cli download URL and trap scope (tinyhumansai#1100)

* chore(staging): v0.53.8

* feat(chat): forward thread_id to backend for KV cache locality (tinyhumansai#1095)

* fix(ci): bump pinned sentry-cli to 3.4.1 (2.34.2 was never published) (tinyhumansai#1102)

* chore(staging): v0.53.9

* fix(ci): drop bash trap in upload_sentry_symbols.sh; inline cleanup (tinyhumansai#1103)

* chore(staging): v0.53.10

* refactor(session): flatten session_raw/, switch md to YYYY_MM_DD (tinyhumansai#1098)

* Add full Composio managed-auth toolkit catalog (tinyhumansai#1093)

* ci: add diff-aware 80% coverage gate (Vitest + cargo-llvm-cov) (tinyhumansai#1104)

* feat(scripts): pnpm work + pnpm debug for agent-driven workflows (tinyhumansai#1105)

* ci: pull pnpm into CI image, drop redundant setup steps (tinyhumansai#1107)

* docs: add Cursor Cloud specific instructions to AGENTS.md (tinyhumansai#1106)

Co-authored-by: Cursor Agent <cursoragent@cursor.com>

* chore(staging): v0.53.11

* docs: surface 80% coverage gate and scripts/debug runners (tinyhumansai#1108)

* feat(app): show Composio integrations as sorted icon grid on Skills (tinyhumansai#1109)

Co-authored-by: Cursor Agent <cursoragent@cursor.com>

* feat(composio): client-side trigger enable/disable toggles (tinyhumansai#1110)

* feat(skills): channels grid + integrations card polish; tolerant Composio trigger decode (tinyhumansai#1112)

* chore(staging): v0.53.12

* feat(home): early-bird banner + assistant→agent terminology (tinyhumansai#1113)

* feat(updater): in-app auto-update with auto-download + restart prompt (tinyhumansai#677) (tinyhumansai#1114)

* chore(claude): add ship-and-babysit slash command (tinyhumansai#1115)

* feat(home): EarlyBirdyBanner + agent terminology + LinkedIn enrichment model pin (tinyhumansai#1118)

* fix(chat): single onboarding thread in sidebar after wizard (tinyhumansai#1116)

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Steven Enamakel <senamakel@users.noreply.github.com>

* fix: filter out global namespace from citation chips (tinyhumansai#1124)

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Co-authored-by: senamakel-droid <281415773+senamakel-droid@users.noreply.github.com>

* feat(nav): enable Memory tab in BottomTabBar (tinyhumansai#1125)

* feat(memory): singleton ingestion + status RPC + UI pill (tinyhumansai#1126)

* feat(human): mascot tab with viseme-driven lipsync (staging only) (tinyhumansai#1127)

* Fix CEF zombie processes on full app close and restart (tinyhumansai#1128)

Co-authored-by: senamakel-droid <281415773+senamakel-droid@users.noreply.github.com>
Co-authored-by: Steven Enamakel <enamakel@tinyhumans.ai>

* Update issue templates for GitHub issue types (tinyhumansai#1146)

* feat(human): expand mascot expressions and tighten reply-speech state machine (tinyhumansai#1147)

* feat(memory): ingestion pipeline + tree-architecture docs + ops/schemas split (tinyhumansai#1142)

* feat(threads): surface live subagent work in parent thread (tinyhumansai#1122) (tinyhumansai#1159)

* fix(human): keep mascot mouth animating when TTS ships no viseme data (tinyhumansai#1160)

* feat(composio): consume backend markdownFormatted for LLM output (tinyhumansai#1165)

* fix(subagent): lazy-register toolkit actions filtered out of fuzzy top-K (tinyhumansai#1162)

* feat(memory): user-facing long-term memory window preset (tinyhumansai#1137) (tinyhumansai#1161)

* fix(tauri-shell): proactively kill stale openhuman RPC on startup (tinyhumansai#1166)

* chore(staging): v0.53.13

* fix(composio): per-action tool consumes backend markdownFormatted (tinyhumansai#1167)

* fix(threads): persist selectedThreadId across reloads (tinyhumansai#1168)

* feat(memory_tree): switch embed model to bge-m3 (1024-dim, 8K context) (tinyhumansai#1174)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(agent): drop redundant [Memory context] recall injection (tinyhumansai#1173)

* chore(memory_tree): drop body-read timeouts on Ollama HTTP calls (tinyhumansai#1171)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(transcript): emit thread_id + fix orchestrator missing cost (tinyhumansai#1169)

* fix(composio/gmail): phase out html2md, prefer text/plain MIME part (tinyhumansai#1170)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(tools): markdown output for internal tool results (tinyhumansai#1172)

* feat(security): enforce prompt-injection guard before model and tool execution (tinyhumansai#1175)

* fix(cef): popup paint dies after first frame — skip blank-page guard for popups (tinyhumansai#1079) (tinyhumansai#1182)

Co-authored-by: Steven Enamakel <31011319+senamakel@users.noreply.github.com>

* chore(sentry): rename OPENHUMAN_SENTRY_DSN → OPENHUMAN_CORE_SENTRY_DSN (tinyhumansai#1186)

* feat(remotion): add yellow mascot character with all animation variants (tinyhumansai#1193)

Co-authored-by: Neel Mistry <neelmistry@Neels-MacBook-Pro.local>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(composio): hide raw connection ID, derive friendly label (tinyhumansai#1153) (tinyhumansai#1185)

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>

* fix(windows): align install.ps1 MSI with per-machine scope (tinyhumansai#913) (tinyhumansai#1187)

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(tauri): deterministic CEF teardown on full app close (tinyhumansai#1120) (tinyhumansai#1189)

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(composio): cap Gmail HTML body before strip (crash mitigation) (tinyhumansai#1191)

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(auth): stop stale chat threads after signup (tinyhumansai#1192)

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(sentry): staging-only "Trigger Sentry Test" button (tinyhumansai#1072) (tinyhumansai#1183)

* chore(staging): v0.53.14

* chore(staging): v0.53.15

* feat(composio): format trigger slugs into human-readable labels (tinyhumansai#1129) (tinyhumansai#1179)

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>

* fix(ui): hide unsupported permission UI on non-macOS for Screen Intelligence (tinyhumansai#1194)

Co-authored-by: Cursor <cursoragent@cursor.com>

* chore(tauri-shell): retire embedded Gmail webview-account flow (tinyhumansai#1181)

* feat(onboarding): replace welcome-agent bot with react-joyride walkthrough (tinyhumansai#1180)

* chore(release): v0.53.16

* fix(threads): preserve selectedThreadId on cold-boot identity hydration (tinyhumansai#1196)

* feat(core): version/shutdown/update RPCs + mid-thread integration refresh (tinyhumansai#1195)

* fix(mascot): swap to yellow mascot via @remotion/player (tinyhumansai#1200)

* feat(memory_tree): cloud-default LLM, queue priority, entity filter, Memory tab UI (tinyhumansai#1198)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Persist turn state + restore conversation history on cold-boot (tinyhumansai#1202)

* feat(mascot): floating desktop mascot via native NSPanel + WKWebView (macOS) (tinyhumansai#1203)

* fix(memory/tree): emit summary children as Obsidian wikilinks (tinyhumansai#1210)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(tools): coding-harness baseline primitives (tinyhumansai#1205) (tinyhumansai#1208)

* docs: add Codex PR checklist for remote agents

---------

Co-authored-by: Steven Enamakel <31011319+senamakel@users.noreply.github.com>
Co-authored-by: WOZCODE <contact@withwoz.com>
Co-authored-by: sanil-23 <sanil@vezures.xyz>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Cyrus Gray <144336577+graycyrus@users.noreply.github.com>
Co-authored-by: CodeGhost21 <164498022+CodeGhost21@users.noreply.github.com>
Co-authored-by: oxoxDev <164490987+oxoxDev@users.noreply.github.com>
Co-authored-by: Mega Mind <146339422+M3gA-Mind@users.noreply.github.com>
Co-authored-by: Gaurang Patel <ptelgm.yt@gmail.com>
Co-authored-by: unn-Known1 <unn-known1@users.noreply.github.com>
Co-authored-by: Steven Enamakel <enamakel@tinyhumans.ai>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Steven Enamakel <senamakel@users.noreply.github.com>
Co-authored-by: Steven Enamakel's Droid <enamakel.agent@tinyhumans.ai>
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Co-authored-by: senamakel-droid <281415773+senamakel-droid@users.noreply.github.com>
Co-authored-by: YellowSnnowmann <167776381+YellowSnnowmann@users.noreply.github.com>
Co-authored-by: Neil <neil@maha.xyz>
Co-authored-by: Neel Mistry <neelmistry@Neels-MacBook-Pro.local>
Co-authored-by: obchain <167975049+obchain@users.noreply.github.com>
Co-authored-by: Jwalin Shah <jshah1331@gmail.com>
AusAgentSmith pushed a commit to AusAgentSmith/openhuman that referenced this pull request May 23, 2026