Skip to content

fix: checkpoint diffs never resolve due to shared PubSub subscriptions#864

Closed
jasonLaster wants to merge 1 commit intopingdotgg:mainfrom
jasonLaster:checkpoint-diff-fix
Closed

fix: checkpoint diffs never resolve due to shared PubSub subscriptions#864
jasonLaster wants to merge 1 commit intopingdotgg:mainfrom
jasonLaster:checkpoint-diff-fix

Conversation

@jasonLaster
Copy link
Copy Markdown
Contributor

@jasonLaster jasonLaster commented Mar 10, 2026

Stream.fromPubSub() creates a subscription eagerly. Storing the result as a property meant all consumers shared one queue, so only the first reader (wsServer/ProviderRuntimeIngestion) got events. CheckpointReactor received zero events and never captured git checkpoints.

Fix: change streamDomainEvents and streamEvents to getters so each consumer gets its own subscription. Also add a domain event handler that replaces placeholder checkpoints with real git captures, and extract shared CWD resolution and capture logic to reduce duplication.

Closes #585
Closes #595

Note

Fix checkpoint diffs never resolving due to shared PubSub subscriptions

  • Converts streamDomainEvents in OrchestrationEngine and streamEvents in ProviderService from fixed stream exports to getters that call Stream.fromPubSub() on each access, giving each consumer an independent subscription.
  • Adds a captureCheckpointFromPlaceholder handler in CheckpointReactor that replaces placeholder (status "missing") checkpoints with a real git checkpoint when a thread.turn-diff-completed event is received.
  • Refactors captureCheckpointFromTurnCompletion so placeholder checkpoints no longer block a real capture for the same turn, and centralizes cwd resolution and capture dispatch into resolveCheckpointCwd and captureAndDispatchCheckpoint helpers.
  • Behavioral Change: workspace index cache is no longer cleared after checkpoint capture, and missing cwd/non-git workspaces now silently skip rather than emitting warning logs.

Macroscope summarized 661a179.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 10, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 02cfcb71-305c-4647-9ed7-7b740c5eb673

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions github-actions bot added the vouch:trusted PR author is trusted by repo permissions or the VOUCHED list. label Mar 10, 2026
);
return;
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Critical Layers/CheckpointReactor.ts:424

captureCheckpointFromPlaceholder references targetCheckpointRef, fromCheckpointRef, and nextTurnCount which are never declared in its scope, so it throws ReferenceError at runtime. The function also calls captureAndDispatchCheckpoint with missing arguments (no turnCount, status, etc.) and then duplicates that function's logic manually, which would emit duplicate events. This appears to be an incomplete refactor where code was copied from captureCheckpointFromTurnCompletion without the corresponding variable definitions.

🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file apps/server/src/orchestration/Layers/CheckpointReactor.ts around line 424:

`captureCheckpointFromPlaceholder` references `targetCheckpointRef`, `fromCheckpointRef`, and `nextTurnCount` which are never declared in its scope, so it throws `ReferenceError` at runtime. The function also calls `captureAndDispatchCheckpoint` with missing arguments (no `turnCount`, `status`, etc.) and then duplicates that function's logic manually, which would emit duplicate events. This appears to be an incomplete refactor where code was copied from `captureCheckpointFromTurnCompletion` without the corresponding variable definitions.

Evidence trail:
apps/server/src/orchestration/Layers/CheckpointReactor.ts lines 387-520 (REVIEWED_COMMIT) - `captureCheckpointFromPlaceholder` function definition and body showing undeclared variables `targetCheckpointRef`, `fromCheckpointRef`, `nextTurnCount`
apps/server/src/orchestration/Layers/CheckpointReactor.ts lines 190-220 (REVIEWED_COMMIT) - `captureAndDispatchCheckpoint` function signature showing required parameters including `turnCount`, `status`, `assistantMessageId`, `createdAt`
apps/server/src/orchestration/Layers/CheckpointReactor.ts lines 314-378 (REVIEWED_COMMIT) - `captureCheckpointFromTurnCompletion` function showing correct pattern with proper variable declarations

@jasonLaster jasonLaster force-pushed the checkpoint-diff-fix branch from 0784b67 to 4153865 Compare March 10, 2026 23:14
Stream.fromPubSub() creates a subscription eagerly. Storing the result
as a property meant all consumers shared one queue, so only the first
reader (wsServer/ProviderRuntimeIngestion) got events. CheckpointReactor
received zero events and never captured git checkpoints.

Fix: change streamDomainEvents and streamEvents to getters so each
consumer gets its own subscription. Also add a domain event handler
that replaces placeholder checkpoints with real git captures, and
extract shared CWD resolution and capture logic to reduce duplication.

Closes pingdotgg#585
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

vouch:trusted PR author is trusted by repo permissions or the VOUCHED list.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Checkpoint diffs never resolve: ProviderRuntimeIngestion placeholder preempts CheckpointReactor git capture

2 participants