Skip to content

[codex] follow up glm thinking recovery#63

Merged
terisuke merged 15 commits intofeat/glm-thinking-fix-and-memory-systemfrom
codex/fix-glm-thinking-followup
Apr 5, 2026
Merged

[codex] follow up glm thinking recovery#63
terisuke merged 15 commits intofeat/glm-thinking-fix-and-memory-systemfrom
codex/fix-glm-thinking-followup

Conversation

@terisuke
Copy link
Copy Markdown

@terisuke terisuke commented Apr 5, 2026

What changed

  • compact on RepetitionError instead of stopping the turn
  • ignore stale assistant messages without finish when building the next turn
  • return the last finished assistant when an unfinished stale assistant trails it
  • add regression coverage for repetition compaction and polluted unfinished-session recovery

Why

PR #56 improves GLM-5.1 thinking corruption handling during the active stream, but two gaps remained:

  • repetition loops still stopped the turn instead of forcing compaction
  • already-saved broken unfinished assistant output was still re-injected into later turns, poisoning resumed sessions even after the live-stream fix

Impact

Root cause

The saved corrupted assistant output in the affected session had no finish, but SessionPrompt.run still kept that message in MessageV2.filterCompactedEffect(sessionID) and passed it back into the next model turn. In parallel, SessionProcessor.halt only auto-compacted ContextOverflowError, not the already-detected RepetitionError path.

Validation

  • bun test test/session/processor-effect.test.ts
  • bun test test/session/prompt-effect.test.ts
  • bun run typecheck
  • rebuilt the live binary with OPENCODE_CHANNEL='live/guardrails' OPENCODE_VERSION='0.0.0-live/guardrails-202604041233' bun run build --single
  • verified the real affected session with opencode run -s ses_2a395c5e6ffeuPcAR7KBys7W91 "Check GitHub directly and answer in one line only: is https://github.com/Cor-Incorporated/Grift/pull/412 OPEN or MERGED?", which executed gh pr view 412 --repo Cor-Incorporated/Grift --json state --jq .state and returned MERGED

Notes

terisuke and others added 12 commits April 5, 2026 16:20
ToolRegistry.layer now requires Question.Service and Todo.Service as
direct dependencies after upstream refactor. Build those layers
explicitly before providing them to the registry, matching the pattern
in prompt-effect.test.ts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Skip parallel implementation enforcement when repository has no
commits yet. This prevents the deadlock where mutations are blocked
until team tool is called, but team/yardadd requires HEAD to create
worktrees.

- chat.message hook: detect HEAD-less repo and suspend parallel policy
- yardadd(): fail early with clear error if HEAD is missing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Empty the evals Set so all configured providers are available for
standard work. Guard both eval-related gate checks with evals.size > 0
to prevent provider-eval from being blocked when no eval providers
are defined. Update tests accordingly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…de (#25)

Verify that parallel implementation policy is suspended when repository
has no commits, preventing the deadlock described in issue #25.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add Notification service with zero dependencies (darwin/linux/win32)
- macOS notifications attributed to Terminal.app (fixes click target)
- Add terminal focus detection to avoid notifications when focused
- Add 'notifications' config option to tui.json (default: true)
- Listen to session.idle for completion notifications
- Update session.error to also show system notifications
- Fix documentation plugin example (osascript attribution)
- Beautify TUI toast: full borders + variant icons (✓✗ℹ⚠)
- Add .catch() to session.idle notification to prevent crashes
- Use proper escapeForOsascript function for macOS
- Add more terminals to detection list (vscode, code)
- Restore escapeXml for Windows branch
feat(opencode): add memory system and GLM thinking corruption fix
[codex] fix team redirection false positive for read-only bash
@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 5, 2026

Hey! Your PR title [codex] follow up glm thinking recovery doesn't follow conventional commit format.

Please update it to start with one of:

  • feat: or feat(scope): new feature
  • fix: or fix(scope): bug fix
  • docs: or docs(scope): documentation changes
  • chore: or chore(scope): maintenance tasks
  • refactor: or refactor(scope): code refactoring
  • test: or test(scope): adding or updating tests

Where scope is the package name (e.g., app, desktop, opencode).

See CONTRIBUTING.md for details.

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 5, 2026

This PR doesn't fully meet our contributing guidelines and PR template.

What needs to be fixed:

  • PR description is missing required template sections. Please use the PR template.

Please edit this PR description to address the above within 2 hours, or it will be automatically closed.

If you believe this was flagged incorrectly, please let a maintainer know.

terisuke and others added 3 commits April 5, 2026 16:55
GitHub-hosted runners are not allocated for pull_request events on
forked repos. Switch to pull_request_target which runs in the base
repo context. Explicitly checkout PR head SHA to test the correct code.

Closes #62

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…target

fix(ci): use pull_request_target for fork runner compatibility
@terisuke terisuke force-pushed the codex/fix-glm-thinking-followup branch from 7ad609f to d33e51a Compare April 5, 2026 08:11
@terisuke terisuke marked this pull request as ready for review April 5, 2026 08:11
Copilot AI review requested due to automatic review settings April 5, 2026 08:11
@terisuke terisuke merged commit 51e4372 into feat/glm-thinking-fix-and-memory-system Apr 5, 2026
2 of 4 checks passed
terisuke added a commit that referenced this pull request Apr 5, 2026
Cherry-pick of PR #63 (d33e51a) which was accidentally merged into
the feature branch instead of dev.

- RepetitionError triggers compaction instead of stopping the turn
- Filter stale unfinished assistant messages from resumed sessions
- 3 regression tests added

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves session recovery and compaction behavior (especially around repetition loops and stale unfinished assistant messages), and also introduces/updates notification and guardrails-related behavior plus CI workflow adjustments.

Changes:

  • Compact on repetition loops and prevent unfinished/stale assistant output from being re-injected into subsequent turns.
  • Add regression tests for prompt/session recovery and repetition compaction paths.
  • Add cross-platform system notifications in the TUI and extend guardrails profile behavior (including a new team orchestration plugin), plus CI script/workflow tweaks.

Reviewed changes

Copilot reviewed 37 out of 37 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
packages/opencode/src/session/processor.ts Treat RepetitionError like context overflow to trigger compaction.
packages/opencode/src/session/prompt.ts Sanitize message history to ignore unfinished assistants; adjust loop exit/interrupt behavior.
packages/opencode/test/session/processor-effect.test.ts Regression test asserting repetition loops return "compact".
packages/opencode/test/session/prompt-effect.test.ts Regression tests for ignoring unfinished assistant pollution and returning last finished assistant.
packages/opencode/src/notification/index.ts New notification utility (macOS/Linux/Windows).
packages/opencode/test/notification.test.ts Tests for terminal-app normalization and Windows toast XML escaping.
packages/opencode/src/config/tui-schema.ts Adds notifications setting to tui.json schema.
packages/opencode/src/cli/cmd/tui/app.tsx Trigger OS notifications on session.idle / session.error when not focused.
packages/opencode/src/cli/cmd/tui/ui/toast.tsx Toast UI polish (icons, border, stricter error typing).
packages/opencode/src/cli/cmd/tui/feature-plugins/home/tips-view.tsx Updates tip to reference built-in notifications in tui.json.
packages/web/src/content/docs/plugins.mdx Docs update: AppleScript line + mention built-in TUI notifications.
packages/web/src/content/docs/ar/plugins.mdx Docs update: AppleScript notification snippet.
packages/web/src/content/docs/bs/plugins.mdx Docs update: AppleScript notification snippet.
packages/web/src/content/docs/da/plugins.mdx Docs update: AppleScript notification snippet.
packages/web/src/content/docs/de/plugins.mdx Docs update: AppleScript notification snippet.
packages/web/src/content/docs/es/plugins.mdx Docs update: AppleScript notification snippet.
packages/web/src/content/docs/fr/plugins.mdx Docs update: AppleScript notification snippet.
packages/web/src/content/docs/it/plugins.mdx Docs update: AppleScript notification snippet.
packages/web/src/content/docs/ja/plugins.mdx Docs update: AppleScript notification snippet.
packages/web/src/content/docs/ko/plugins.mdx Docs update: AppleScript notification snippet.
packages/web/src/content/docs/nb/plugins.mdx Docs update: AppleScript notification snippet.
packages/web/src/content/docs/pl/plugins.mdx Docs update: AppleScript notification snippet.
packages/web/src/content/docs/pt-br/plugins.mdx Docs update: AppleScript notification snippet.
packages/web/src/content/docs/ru/plugins.mdx Docs update: AppleScript notification snippet.
packages/web/src/content/docs/th/plugins.mdx Docs update: AppleScript notification snippet.
packages/web/src/content/docs/tr/plugins.mdx Docs update: AppleScript notification snippet.
packages/web/src/content/docs/zh-cn/plugins.mdx Docs update: AppleScript notification snippet.
packages/web/src/content/docs/zh-tw/plugins.mdx Docs update: AppleScript notification snippet.
packages/opencode/test/scenario/harness.ts Adjusts layers provided to ToolRegistry in scenario harness.
packages/guardrails/profile/plugins/guardrail.ts Removes default eval-only provider set; gates eval-only logic behind non-empty set.
packages/guardrails/profile/AGENTS.md Updates provider admission guidance to reflect new policy behavior.
packages/guardrails/profile/plugins/team.ts New team/background orchestration tools + enforcement hooks for large requests.
packages/opencode/test/scenario/guardrails.test.ts Updates provider-lane expectations; adds tests for team plugin redirect/headless behavior.
packages/opencode/package.json Ensures artifact directory exists before junit output in CI.
packages/app/package.json Ensures artifact directory exists before junit output in CI.
.github/workflows/typecheck.yml Switches to pull_request_target and checks out PR head SHA.
.github/workflows/test.yml Switches to pull_request_target, checks out PR head SHA, tweaks artifact upload naming/options.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 3 to 30
on:
push:
branches:
- dev
pull_request:
pull_request_target:
workflow_dispatch:

concurrency:
# Keep every run on dev so cancelled checks do not pollute the default branch
# commit history. PRs and other branches still share a group and cancel stale runs.
group: ${{ case(github.ref == 'refs/heads/dev', format('{0}-{1}', github.workflow, github.run_id), format('{0}-{1}', github.workflow, github.event.pull_request.number || github.ref)) }}
cancel-in-progress: true

permissions:
contents: read
checks: write

jobs:
unit:
name: unit (linux)
runs-on: blacksmith-4vcpu-ubuntu-2404
steps:
- name: Checkout repository
uses: actions/checkout@v4
with:
ref: ${{ github.event.pull_request.head.sha || github.sha }}
token: ${{ secrets.GITHUB_TOKEN }}

Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using pull_request_target while checking out the PR head SHA runs untrusted PR code with the base-repo GITHUB_TOKEN (here granted checks: write). This is a known security footgun because PR code can exfiltrate tokens/secrets or write to the repo via API; prefer pull_request for tests/typecheck, or keep pull_request_target but only run trusted code (checkout base ref) / gate forked PRs and strip permissions (and consider persist-credentials: false).

Copilot uses AI. Check for mistakes.
uses: actions/upload-artifact@v4
with:
name: unit-linux-${{ github.run_attempt }}
name: unit-${{ matrix.settings.name }}-${{ github.run_attempt }}
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

name: unit-${{ matrix.settings.name }}-... references matrix.* but this job does not define a matrix strategy. This will fail workflow evaluation at runtime/compile time; use a static name (e.g. unit-linux-...) or introduce an actual matrix.

Suggested change
name: unit-${{ matrix.settings.name }}-${{ github.run_attempt }}
name: unit-linux-${{ github.run_attempt }}

Copilot uses AI. Check for mistakes.
Comment on lines 3 to 18
@@ -13,6 +13,8 @@ jobs:
steps:
- name: Checkout repository
uses: actions/checkout@v4
with:
ref: ${{ github.event.pull_request.head.sha || github.sha }}

Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switching to pull_request_target (and checking out the PR head SHA) runs untrusted PR code with the base-repo token context. For a typecheck workflow this is usually unnecessary risk; prefer pull_request or ensure the workflow never executes PR-provided code with elevated permissions/secrets.

Copilot uses AI. Check for mistakes.
Comment on lines +27 to +28
.default(true)
.optional()
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

z.boolean().default(true).optional() won’t apply the default because .optional() wraps the schema after the default (parsing undefined will stay undefined). If you want a default of true, use .optional().default(true) (or drop .optional() entirely).

Suggested change
.default(true)
.optional()
.optional()
.default(true)

Copilot uses AI. Check for mistakes.
Comment on lines +820 to +827
sdk.event.on("session.idle", async (evt) => {
if (tuiConfig.notifications === false) return
const focused = await Notification.terminalIsFocused().catch(() => false)
if (focused) return

const sessionID = evt.properties.sessionID
Notification.show("opencode", `${sessionID} completed`).catch(() => {})
})
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On non-macOS platforms Notification.terminalIsFocused() always returns false, so notifications will always fire even when the TUI is frontmost. If the intent is “only notify when focus is lost”, consider gating notifications by platform (or adding focus detection for Linux/Windows) to avoid noisy always-on alerts.

Copilot uses AI. Check for mistakes.
Comment on lines +45 to +76
export namespace Notification {
export async function terminalIsFocused(): Promise<boolean> {
if (platform() !== "darwin") return false

const result = await Process.text(
[
"osascript",
"-e",
'tell application "System Events" to get name of first application process whose frontmost is true',
],
{ nothrow: true },
)
return terminal(result.text.trim())
}

export async function show(title: string, message: string): Promise<void> {
const os = platform()

if (os === "darwin") {
const escaped = escapeForOsascript(message)
const titleEscaped = escapeForOsascript(title)
await Process.run(
[
"osascript",
"-e",
`tell application "Terminal" to display notification "${escaped}" with title "${titleEscaped}"`,
],
{ nothrow: true },
)
return
}

Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change set introduces new cross-platform notification infrastructure, but the PR title/description are focused on GLM repetition/thinking recovery. Please update the PR description to cover the notifications feature (and any UX/OS support expectations) or split it into a separate PR to keep the follow-up scoped and reviewable.

Copilot uses AI. Check for mistakes.
Comment on lines +9 to +12
const kids = new Set<string>()
const need = new Map<string, Need>()
const live = new Map<string, Run>()

Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The module-level kids/need/live collections grow without any eviction/cleanup. In a long-running process this can become an unbounded memory leak across sessions; consider pruning entries (e.g., on session completion/abort events, by TTL, or by max-size LRU) and/or persisting minimal state to disk instead of keeping everything in memory forever.

Suggested change
const kids = new Set<string>()
const need = new Map<string, Need>()
const live = new Map<string, Run>()
const cacheTTL = 1000 * 60 * 60
const cacheMax = 1000
class PrunableSet<T> extends Set<T> {
private readonly seen = new Map<T, number>()
private prune() {
const now = Date.now()
for (const [value, at] of this.seen) {
if (now - at > cacheTTL) {
this.seen.delete(value)
super.delete(value)
}
}
while (this.size > cacheMax) {
const oldest = this.values().next()
if (oldest.done) break
this.seen.delete(oldest.value)
super.delete(oldest.value)
}
}
override add(value: T) {
this.prune()
this.seen.set(value, Date.now())
const result = super.add(value)
this.prune()
return result
}
override has(value: T) {
this.prune()
return super.has(value)
}
override delete(value: T) {
this.seen.delete(value)
return super.delete(value)
}
override clear() {
this.seen.clear()
return super.clear()
}
}
class PrunableMap<K, V> extends Map<K, V> {
private readonly seen = new Map<K, number>()
private prune() {
const now = Date.now()
for (const [key, at] of this.seen) {
if (now - at > cacheTTL) {
this.seen.delete(key)
super.delete(key)
}
}
while (this.size > cacheMax) {
const oldest = this.keys().next()
if (oldest.done) break
this.seen.delete(oldest.value)
super.delete(oldest.value)
}
}
override set(key: K, value: V) {
this.prune()
if (super.has(key)) {
super.delete(key)
}
this.seen.set(key, Date.now())
const result = super.set(key, value)
this.prune()
return result
}
override get(key: K) {
this.prune()
return super.get(key)
}
override has(key: K) {
this.prune()
return super.has(key)
}
override delete(key: K) {
this.seen.delete(key)
return super.delete(key)
}
override clear() {
this.seen.clear()
return super.clear()
}
}
const kids = new PrunableSet<string>()
const need = new PrunableMap<string, Need>()
const live = new PrunableMap<string, Run>()

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants