systemReminder was re-shown on every query, not just the first #228
systemReminder is meant to be ephemeral context for the opening call of a human turn: a nudge to the model that isn't stored in conversation history. It was instead passed on every iteration of the agent loop, including tool-result continuations.

Tool-result continuations are not new user turns. They are the model's own tool-use exchange completing itself. Appending user-turn context there is semantically wrong, and it changes the request shape on every loop iteration, breaking prompt cache stability.

The fix consumes systemReminder exactly once: extracted from options before the loop in AgentRun.execute(), passed into the first #getMessageStream call, then cleared. Every subsequent call in the same turn sees undefined.

To make this testable without hitting the Anthropic API, AgentRun's constructor was changed to accept IMessageStreamer and IAgentChannelFactory abstractions instead of a raw Anthropic client. AnthropicAgent creates the concrete implementations and passes them in. FakeMessageStreamer in the test captures every call body, allowing direct assertion on what was and wasn't sent.
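The consume-once loop and the fake streamer described in this commit message can be sketched roughly as follows. This is a simplified illustration, not the PR's code: the `MessageStreamer` interface, `withReminder`, and `runTurn` are hypothetical stand-ins for `IMessageStreamer`, `RequestBuilder`, and `AgentRun.execute()`, whose real signatures are not shown here, and the sketch is synchronous where the real streaming call would be asynchronous.

```typescript
// Hypothetical, simplified shapes: the real interfaces are not shown in the PR text.
type Message = { role: 'user' | 'assistant'; content: string };
type StopReason = 'tool_use' | 'end_turn';

interface MessageStreamer {
  send(messages: Message[]): StopReason;
}

// Test double: records every request it receives and replays scripted stop reasons.
class FakeMessageStreamer implements MessageStreamer {
  readonly calls: Message[][] = [];
  private readonly script: StopReason[];

  constructor(script: StopReason[]) {
    this.script = [...script];
  }

  send(messages: Message[]): StopReason {
    this.calls.push(messages.map((m) => ({ ...m })));
    return this.script.shift() ?? 'end_turn';
  }
}

// Assumed injection strategy: append the reminder as extra user-turn context.
// When the reminder is undefined the messages are returned unchanged.
function withReminder(messages: Message[], reminder: string | undefined): Message[] {
  if (reminder === undefined) return messages;
  return [...messages, { role: 'user', content: reminder }];
}

function runTurn(streamer: MessageStreamer, userText: string, reminder?: string): void {
  const history: Message[] = [{ role: 'user', content: userText }];
  let systemReminder = reminder; // consumed exactly once
  for (;;) {
    const stop = streamer.send(withReminder(history, systemReminder));
    systemReminder = undefined; // every later call in this turn sees undefined
    if (stop === 'end_turn') return;
    // Tool-result continuation: the model's own exchange, not a new user turn.
    history.push({ role: 'assistant', content: '[tool_use]' });
    history.push({ role: 'user', content: '[tool_result]' });
  }
}

const fake = new FakeMessageStreamer(['tool_use', 'end_turn']);
runTurn(fake, 'list the files', 'git delta: 2 files changed');
const sawReminder = fake.calls.map((call) => call.some((m) => m.content.startsWith('git delta')));
console.log(sawReminder); // first call carries the reminder, the continuation does not
```

Because the fake is injected rather than constructed inside the loop, the test can assert on the captured request bodies directly, with no network access.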
… first

AgentMessageHandler stored gitDelta at construction time and rendered it on every query_summary message. The field was set once, never cleared, so every loop iteration, including tool-result continuations, showed the git delta header even though the reminder was only meant to be visible for the first API call of a turn.

Fix: thread systemReminder through the query_summary SDK message itself. AgentRun already has a local variable that holds the reminder for the first call and is cleared to undefined after; that same value is now included in the channel send before the clear. AgentMessageHandler reads msg.systemReminder, so it only renders when the message actually carries the value, which is exactly once per turn.

Removes gitDelta from AgentMessageHandlerOptions entirely; the field only ever existed to feed this display path, and that role is now filled by the message payload.

Tests added:
- AgentRun: first query_summary carries systemReminder, second does not
- AgentMessageHandler: appends reminder when set, omits it when absent

Closes #227
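The display-side fix can be illustrated with a small sketch. It assumes a `query_summary` message with an optional `systemReminder` field, as the commit describes; the `iteration` field and the helper names `summariesForTurn` and `renderQuerySummary` are hypothetical and exist only for this example.

```typescript
// Hypothetical message shape: only `type` and `systemReminder` come from the PR text.
type QuerySummary = {
  type: 'query_summary';
  iteration: number;
  systemReminder?: string | undefined;
};

// Stand-in for the handler: render from the message payload, not constructor state.
function renderQuerySummary(msg: QuerySummary): string {
  const header = `query #${msg.iteration}`;
  return msg.systemReminder === undefined ? header : `${header}\n${msg.systemReminder}`;
}

// Stand-in for the agent loop: the reminder is put on the message before it is cleared.
function summariesForTurn(iterations: number, reminder?: string): QuerySummary[] {
  const sent: QuerySummary[] = [];
  let systemReminder = reminder;
  for (let i = 1; i <= iterations; i++) {
    sent.push({ type: 'query_summary', iteration: i, systemReminder });
    systemReminder = undefined; // later summaries in the same turn carry nothing
  }
  return sent;
}

const rendered = summariesForTurn(2, 'git delta: 2 files changed').map(renderQuerySummary);
console.log(rendered);
```

Since the handler holds no reminder state of its own, it cannot show a stale value: what it renders is determined entirely by the message it was given.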
9483977 to
9c9a97f
bananabot9000
left a comment
Now I see what you meant by "insane fix for a simple bug" 🍌
The fix is 3 lines. Consume systemReminder into a local variable, pass it to the first #getMessageStream call, set it to undefined. Done. Bug fixed.
The other 300 lines are testability refactoring. To PROVE the fix works, 006 needed to intercept what gets sent to the Anthropic API. Which meant abstracting the Anthropic client behind IMessageStreamer, abstracting the MessageChannel behind IAgentChannelFactory, creating FakeMessageStreamer that captures call bodies, FakeAgentChannel that captures SDK messages, and scripting multi-step tool-use flows with makeToolUseStream/makeEndTurnStream.
That's the "insane" part — the testing infrastructure is 50x the fix. But now AgentRun is testable without hitting the real API, which pays dividends forever.
Architecture improvements baked in:
- RequestBuilderOptions extracted from RunAgentQuery: RequestBuilder only sees what it needs, not the full query shape. Clean separation.
- IMessageStreamer / IAgentChannelFactory: proper DI seams. AgentRun no longer creates its own dependencies.
- gitDelta removed from AgentMessageHandlerOptions entirely: the display path reads from the message payload now, not constructor state.
The consume-once pattern:
let systemReminder = this.#options.systemReminder;
// ... in loop:
const stream = this.#getMessageStream(messages, systemReminder);
systemReminder = undefined; // gone after first use
Simple, readable, no flags, no "isFirstIteration" booleans. The variable IS the state.
Tests verify both bugs:
- API injection: first call has systemReminder in body, second (tool-result continuation) does not
- Display: first query_summary carries systemReminder, second does not
- RequestBuilder: injects when set, leaves messages unchanged when undefined
No sensitive data, no reverted changes, no session log artifacts.
81+ PRs reviewed. The goldfish tests the plumbing 🍌
* Switch packages from custom build scripts to tsup

  Each package had a hand-rolled build.ts that called esbuild directly. Replaced with tsup.config.ts in each package, which handles ESM + CJS output and DTS generation in one pass.

  Key decisions:
  - bundle: true is required because bundle: false only transpiles entry files, leaving relative imports unresolved in dist. Consuming bundlers follow those imports and find nothing.
  - clean: true replaces the @shellicar/build-clean plugin. The plugin uses esbuild's metafile to find current build outputs and delete everything else. TypeScript-generated DTS files never appear in the metafile, so the plugin deleted them every time. clean: true wipes the outdir before the build starts, so JS and DTS land together.
  - entryNames: '[name]' in esbuildOptions puts JS files at the same level as DTS files. tsup's TypeScript pass ignores esbuild's entryNames entirely, so DTS always lands flat at dist/esm/[name].d.ts. Matching JS placement keeps them co-located and matches the exports map.
  - claude-sdk-tools entry is src/entry/*.ts, not src/*.ts. The public tool modules live under src/entry/; src/*.ts only contains internal utilities.

  Exports maps updated to the nested import/require form with types + default sub-conditions, pointing to the correct dist paths. tsconfig.json in each package now scopes to src/**/*.ts and excludes dist/node_modules. tsconfig.check.json symlinks to the root copy. turbo.json build inputs updated to include tsup.config.ts.

  Both apps updated: entryNames '[name]', packages: 'external' instead of a manual external list, bin paths updated from dist/entry/main.js to dist/main.js, workspace deps moved to dependencies since they are externalized and need to be present at runtime.

* Fix biome CI errors

  Format fixes in the three tsup.config.ts files (arrow function body style), indentation in claude-sdk-tools package.json exports, trailing newline in claude-sdk-cli package.json, import sort order in runAgent.ts.

  Remove unused CacheTtl import from AgentRun.ts and reformat a ternary in MessageStream.ts; both left over from the cache TTL refactor that came in via rebase.

* Session log and harness update for packaging tsup migration

  Session 3 appended to 2026-04-09.md covering the build verification, post-rebase check, biome CI fix approach, and PR #230. Current State updated: branch fix/packaging, PR #230 open, recent PRs #228/#229 added to the post-refactor list.

* Add @shellicar/changes tooling: schema generation, validation, CI integration

  Sets up the full changes.jsonl toolchain for this repo:
  - changes.config.json defines the valid categories (feature, fix, breaking, deprecation, security, performance) with their display names
  - scripts/src/generate-schema.ts generates schema/shellicar-changes.json from the Zod definitions + config; category is required (repo policy, stricter than the base spec)
  - scripts/src/validate-changes.ts validates all **/changes.jsonl files against the schema via ajv; no-args mode globs the repo, or pass specific paths
  - CI (.github/workflows/node.js.yml) runs the validator on every push

  Adds changes.jsonl entries for the tsup packaging work (PR #230) in claude-core, claude-sdk, and claude-sdk-tools.

  Updates both CLAUDE.md files: correct category enum, removed the bad 'issue' field from the example, noted that category is required and that issue links belong in metadata not at the top level, added the @shellicar/changes tooling section.

* Session log: @shellicar/changes tooling session (2026-04-09 continued)

* Add @shellicar/changes toolchain with per-package changelogs

  Builds out the full changes toolchain across the monorepo:
  - changes.config.json: Keep a Changelog standard categories (added/changed/deprecated/removed/fixed/security). Custom category names were replaced because these are the recognised standard and tooling elsewhere understands them.
  - schema/shellicar-changes.schema.json (renamed from .json): generated JSON Schema artifact. The .schema.json suffix is conventional for JSON Schema files and makes intent unambiguous in editors.
  - scripts/src/generate-schema.ts: generates schema from Zod definitions + changes.config.json. Must run from scripts/ directory.
  - scripts/src/validate-changes.ts: validates every **/changes.jsonl against the schema via ajv. CI runs this on every push.
  - scripts/src/generate-changelog.ts: generates CHANGELOG.md for a package from its changes.jsonl. Groups entries by category in config order within each release. Renders metadata.issue as (#NNN) and metadata.ghsa as a linked advisory suffix.
  - Release markers gain an optional tag field. When absent the script defaults to <shortName>@<version>. The claude-cli app uses explicit tags because its historical alphas used unscoped version tags.
  - changes.jsonl + CHANGELOG.md added for all five packages/apps. apps/claude-cli carries the full reverse-engineered history from the root CHANGELOG.md (alpha.67 through alpha.74) so the generated CHANGELOG.md is the authoritative source going forward.
  - Both CLAUDE.md files updated with a dedicated @shellicar/changes section covering config, schema, and all three scripts.

* Fix biome errors in scripts

  Formatting applied via --changed --since=origin/main to scope fixes to only the files we touched. Rewrote the ??= pattern as a plain if-block: clearer to read, no cleverness required.

* Fix CI: run validate via scripts package, not pnpm tsx from root

  tsx is only installed in the scripts workspace package. Running pnpm tsx from the repo root fails because there is no root-level tsx binary. Use pnpm --filter scripts run validate instead, which runs inside the package where tsx is available.
Problem
Two related bugs where systemReminder (the git delta snapshot) was leaking beyond the first API call of a turn.

Bug 1: API injection
AgentRun passed systemReminder to every #getMessageStream call in the loop, including tool-result continuations. The agent saw the same git delta repeated in every message during a multi-tool sequence.

Bug 2: Display
AgentMessageHandler stored gitDelta as a constructor-time field and rendered it on every query_summary message. The CLI header showed the git delta on every loop iteration, not just the first.

Fix
Bug 1: Consume systemReminder with a local variable, pass it into the first #getMessageStream call, then set it to undefined. Subsequent iterations of the loop send undefined.
Bug 2: Thread systemReminder through the query_summary SDK message. AgentRun includes it in the channel send (where the local variable still holds the value, before being cleared). AgentMessageHandler reads msg.systemReminder and only renders when the message actually carries it, which is exactly once per turn. Removes gitDelta from AgentMessageHandlerOptions entirely.

Tests
- AgentRun: first query_summary carries systemReminder, second does not
- AgentRun: systemReminder only injected into first API call body, not tool-result continuations
- AgentMessageHandler: appends reminder when set, omits it when absent

Closes #227