Skip to content

fix(sessions): preserve terminal status after agent run completes#60290

Open
jwchmodx wants to merge 5 commits intoopenclaw:mainfrom
jwchmodx:fix/session-status-clobbered-after-run
Open

fix(sessions): preserve terminal status after agent run completes#60290
jwchmodx wants to merge 5 commits intoopenclaw:mainfrom
jwchmodx:fix/session-status-clobbered-after-run

Conversation

@jwchmodx
Copy link
Copy Markdown
Contributor

@jwchmodx jwchmodx commented Apr 3, 2026

Problem

After a run completes, a session can remain persisted with status: "running" even though endedAt and runtimeMs are already set. This wedges the session: new inbound messages are not accepted and stop/abort commands do not work until the store row is manually edited.

Root cause

persistGatewaySessionLifecycleEvent is called with void (fire-and-forget) in the agent event listener. It enqueues a write to set the terminal status on disk. Immediately after, updateSessionStoreAfterAgentRun enqueues its own write.

The lock queue runs them in order:

  1. Lifecycle end handler: reads disk, writes { status: "done", endedAt: X, runtimeMs: Y }
  2. updateSessionStoreAfterAgentRun: reads disk (has "done"), but patches from the in-memory sessionStore which was loaded during session initialisation — before the run — and still carries status: "running". mergeSessionEntry(diskState, next) spreads next.status = "running" over the disk's "done", leaving endedAt/runtimeMs intact (they were never in next).

Final persisted state: { status: "running", endedAt: X, runtimeMs: Y } — the exact stuck pattern described in the issue.

Fix

Omit status from the patch passed to mergeSessionEntry inside updateSessionStoreAfterAgentRun. This function is responsible for token/cost/model fields only; lifecycle status is exclusively owned by persistGatewaySessionLifecycleEvent.

Test

Added a regression test in src/commands/agent/session-store.test.ts:

  • Disk pre-seeded with { status: "done", endedAt: X, runtimeMs: 1234 } (as if lifecycle end already ran)
  • In-memory store has { status: "running" } (stale)
  • After updateSessionStoreAfterAgentRun, asserts persisted.status === "done" and endedAt/runtimeMs preserved

Closes #60250

🤖 Generated with Claude Code

jwchmodx and others added 5 commits April 3, 2026 15:45
…ostReplyRootId

Direct messages in Mattermost were creating threads even with
replyToMode=off because resolveMattermostReplyRootId would fall back
to payload.replyToId regardless of chat kind. When a downstream payload
carried a replyToId (e.g. from block-streaming delivery), this
bypassed the earlier DM threading guard and set root_id on the outbound
post, making DM replies appear as threads instead of in the channel body.

Fix: pass kind through all three delivery call sites and hard-return
undefined inside resolveMattermostReplyRootId for kind="direct",
mirroring the resolveMattermostEffectiveReplyToId guard that already
existed for session-key resolution.

Fixes openclaw#59981
…hreshold

The generic_repeat detector only checked warningThreshold and always
returned level "warning", making criticalThreshold effectively a no-op
for the most common runaway loop pattern (same tool + same args).

Add a critical-threshold check before the warning check, consistent
with how known_poll_no_progress and ping_pong already behave.

Fixes openclaw#60111

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…aining-only fields

MiniMax's usage_percent / usagePercent fields report the *remaining* quota
as a percentage, not the consumed quota. When count fields (prompt_limit /
prompt_remain) are also present, fromCounts already computed the correct
usedPercent and the inverted value was silently ignored. But when only
usage_percent is returned (no count fields), the code treated it as a
used-percent and passed it through unchanged, causing the menu bar to show
"2% left" instead of "98% left".

Move usage_percent and usagePercent from PERCENT_KEYS to a new
REMAINING_PERCENT_KEYS array. deriveUsedPercent now inverts remaining-percent
values to obtain usedPercent, matching the behaviour already validated by the
existing "prefers count-based usage when percent looks inverted" test. Count-
based fromCounts still takes priority over both key groups.

Fixes openclaw#60193

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
MiMo reasoning models (mimo-v2-pro, mimo-v2-omni) output their full
response to reasoning_content with an empty content field. This causes
OpenClaw to emit no visible text since the pi-ai stream emits
reasoning_content deltas as thinking blocks, not as main content.

Setting enable_thinking: false in the request payload directs the model
to write its reply to the standard content field instead.

Closes openclaw#60261

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
persistGatewaySessionLifecycleEvent writes the terminal status to disk
as a fire-and-forget write. updateSessionStoreAfterAgentRun runs after
it, but reads the in-memory sessionStore (loaded before the run) which
still carries status: "running". The stale status was being spread into
the merge patch, clobbering the "done"/"failed"/"killed" status on disk.

Result: sessions could remain stuck as "running" even though endedAt and
runtimeMs were correctly set, blocking new inbound messages and
stop/abort commands until the store row was edited manually.

Fix: omit the status field from the patch in updateSessionStoreAfterAgentRun
so that the lifecycle-managed terminal status on disk is always preserved.

Closes openclaw#60250

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 3, 2026

Greptile Summary

This PR bundles five independent fixes: the primary fix strips status from the patch in updateSessionStoreAfterAgentRun so stale in-memory \"running\" cannot overwrite a terminal status already persisted by persistGatewaySessionLifecycleEvent; the other four changes fix Mattermost DM threading, escalate generic tool-loop repeats to \"critical\" at criticalThreshold, invert the MiniMax usage_percent (remaining→used), and inject enable_thinking: false for Xiaomi reasoning models. All five changes are well-targeted with regression tests.

Confidence Score: 5/5

Safe to merge — all five fixes are targeted, well-tested, and carry no P0/P1 issues.

All remaining findings are P2 or lower. The primary session-status fix is logically correct and covered by a regression test. The four accompanying fixes (Mattermost DM threading, tool-loop critical escalation, MiniMax percent inversion, Xiaomi reasoning-model payload) are each independently sound and tested. No data integrity, security, or reliability concerns were found.

No files require special attention.

Reviews (1): Last reviewed commit: "fix(sessions): preserve terminal status ..." | Re-trigger Greptile

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c230d1d0e8

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +51 to +54
// MiniMax's usage_percent / usagePercent fields report the remaining quota
// as a percentage, not the consumed quota. Treat them as "remaining percent"
// and invert to get usedPercent. Count-based fromCounts always takes priority.
const REMAINING_PERCENT_KEYS = ["usage_percent", "usagePercent"] as const;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Score remaining-percent keys when selecting MiniMax usage records

REMAINING_PERCENT_KEYS is introduced for usage_percent, but scoreUsageRecord still only awards score for PERCENT_KEYS. As a result, nested usage records that only expose usage_percent/usagePercent (and no count fields) can be filtered out of collectUsageCandidates, causing fetchMinimaxUsage to return Unsupported response shape even though valid usage data exists. This regression appears in payloads where the remaining-percent field is present only in nested objects.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

BUG: Completed run can remain persisted as running, blocking new input and stop

1 participant