Skip to content

Fix silent async failures of checkpointing (#13331)#13486

Closed
vishvananda wants to merge 2 commits intogoogle-gemini:mainfrom
vishvananda:fix-checkpoint
Closed

Fix silent async failures of checkpointing (#13331)#13486
vishvananda wants to merge 2 commits intogoogle-gemini:mainfrom
vishvananda:fix-checkpoint

Conversation

@vishvananda
Copy link
Copy Markdown
Contributor

Summary

If the git commit for checkpointing fails, it leaves the shadow repository in a broken state. Even when it doesn't fail, it can take some time to complete on a large repository, causing future checkpoint commands to silently fail. This commit updates the checkpointing to happen synchronously and fail there is something wrong with the git commit.

Details

I'm not a typescript expert, so I had AI help with the code and tests.

Related Issues

Partial for #13331

How to Validate

You should see a message that it is checkpointing before file altering commands run.

Pre-Merge Checklist

  • Updated relevant documentation and README (if needed)
  • Added/updated tests (if needed)
  • Noted breaking changes (if any)
  • Validated on required platforms/methods:
    • MacOS
      • npm run
      • npx
      • Docker
      • Podman
      • Seatbelt
    • Windows
      • npm run
      • npx
      • Docker
    • Linux
      • npm run
      • npx
      • Docker

@vishvananda vishvananda requested a review from a team as a code owner November 20, 2025 06:50
@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello @vishvananda, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses critical issues with asynchronous checkpointing, which previously led to silent failures and potential repository corruption. By making checkpointing synchronous and implementing comprehensive error handling, the system now provides immediate feedback on snapshot creation status. This ensures greater reliability and transparency when using tools that modify files, safeguarding the user's work by creating robust, restorable snapshots with detailed context.

Highlights

  • Synchronous Checkpointing: Checkpointing operations are now performed synchronously, ensuring that file snapshots are created and validated before proceeding with tool calls. This prevents silent failures that could leave the repository in a broken state.
  • Robust Error Handling: Enhanced error handling for snapshot creation. If git commit fails or a commit hash cannot be obtained, an explicit error message is displayed to the user, and the operation is aborted. The model is also informed of the failed tool calls.
  • Detailed Checkpoint Files: Successful checkpoints now save detailed JSON files containing the chat history, client history, tool call arguments, the Git commit hash, and the file path. These files are stored in a project-specific temporary directory.
  • Improved Test Coverage: New test cases have been added to specifically cover the snapshotting functionality, including scenarios where createFileSnapshot is called and when it fails, ensuring the new error handling mechanisms work as expected.
  • Refactored Stream Processing: The logic for processing Gemini stream events and handling tool calls has been refactored, moving the checkpointing logic to occur proactively when tool calls are received, rather than reactively after approval.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request makes checkpointing synchronous and improves error handling, which is a great enhancement. The core logic changes in useGeminiStream.ts are sound, moving the snapshotting to happen before tool calls are scheduled. However, I've found a critical type-safety issue in the new error handling logic that could lead to runtime crashes, an inconsistency in how different snapshot failures are handled, and a gap in the test coverage for the new error path. My review includes detailed suggestions to address these points to make the implementation more robust and reliable.

Comment on lines +871 to +896
const erroredToolCalls: TrackedToolCall[] = toolCallRequests.map(
(request) => ({
request,
status: 'error',
response: {
callId: request.callId,
resultDisplay: '',
error: new Error(
'Tool call aborted due to a critical, non-recoverable file snapshot failure.',
),
errorType: ToolErrorType.UNKNOWN,
responseParts: [
{
functionResponse: {
name: request.name,
response: {
error:
'Tool call aborted due to a critical, non-recoverable file snapshot failure.',
},
},
},
],
},
responseSubmittedToGemini: false,
}),
);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The erroredToolCalls objects created here do not conform to the TrackedToolCall type (specifically TrackedCompletedToolCall for the 'error' status). They are missing required properties like tool, invocation, startTime, and endTime. This violates type safety and is likely to cause runtime errors if handleCompletedTools or any function it calls is modified to access these missing properties in the future. To make the code more robust and type-safe, you should construct a complete TrackedCompletedToolCall object for each failed tool call.

              const toolRegistry = config.getToolRegistry();
              const erroredToolCalls: TrackedToolCall[] = toolCallRequests.map(
                (request): TrackedCompletedToolCall => {
                  const tool = toolRegistry.getTool(request.name);
                  if (!tool) {
                    // This should not happen if validation is correct, but we should be robust.
                    throw new Error(
                      `Snapshot failed: Tool "${request.name}" not found in registry.`,
                    );
                  }
                  const invocation = tool.build(request.args);
                  const error = new Error(
                    'Tool call aborted due to a critical, non-recoverable file snapshot failure.',
                  );

                  return {
                    request,
                    status: 'error',
                    response: {
                      callId: request.callId,
                      resultDisplay: error.message,
                      error,
                      errorType: ToolErrorType.UNKNOWN,
                      responseParts: [
                        {
                          functionResponse: {
                            name: request.name,
                            response: { error: error.message },
                          },
                        },
                      ],
                    },
                    responseSubmittedToGemini: false,
                    tool,
                    invocation,
                    startTime: Date.now(),
                    endTime: Date.now(),
                  };
                },
              );

Comment on lines +902 to +913
if (!commitHash) {
addItem(
{
type: MessageType.ERROR,
text: 'Failed to get a commit hash for the file snapshot. Aborting operation.',
},
Date.now(),
);
setSnapshottingMessage(null);
cancelOngoingRequest();
return StreamProcessingStatus.Completed;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The error handling when createFileSnapshot returns no commitHash is inconsistent with the catch block above. This code calls cancelOngoingRequest(), which aborts the turn without informing the model about the failure. This can lead to a confusing experience where the conversation just stops. To provide a better user experience and allow the model to react to the failure, this case should also create erroredToolCalls and send them back to the model, similar to how the try...catch block handles snapshot errors.

            if (!commitHash) {
              const errorMessage =
                'Failed to get a commit hash for the file snapshot. Aborting operation.';
              addItem(
                {
                  type: MessageType.ERROR,
                  text: errorMessage,
                },
                Date.now(),
              );

              // Inform the model that the tool calls failed.
              const toolRegistry = config.getToolRegistry();
              const erroredToolCalls: TrackedToolCall[] = toolCallRequests.map(
                (request): TrackedCompletedToolCall => {
                  const tool = toolRegistry.getTool(request.name);
                  if (!tool) {
                    throw new Error(
                      `Snapshot failed: Tool "${request.name}" not found in registry.`,
                    );
                  }
                  const invocation = tool.build(request.args);
                  const error = new Error(errorMessage);

                  return {
                    request,
                    status: 'error',
                    response: {
                      callId: request.callId,
                      resultDisplay: error.message,
                      error,
                      errorType: ToolErrorType.UNKNOWN,
                      responseParts: [
                        {
                          functionResponse: {
                            name: request.name,
                            response: { error: error.message },
                          },
                        },
                      ],
                    },
                    responseSubmittedToGemini: false,
                    tool,
                    invocation,
                    startTime: Date.now(),
                    endTime: Date.now(),
                  };
                },
              );

              void handleCompletedToolsRef.current!(erroredToolCalls);
              setSnapshottingMessage(null);
              return StreamProcessingStatus.Completed;
            }

Comment on lines +2802 to +2842
it('should show an error if file snapshotting fails', async () => {
mockCreateFileSnapshot.mockRejectedValue(new Error('Snapshot failed'));
const geminiClient = new MockedGeminiClientClass(mockConfig);
const { result } = renderTestHook([], geminiClient);

await act(async () => {
const stream =
(async function* (): AsyncGenerator<ServerGeminiStreamEvent> {
yield {
type: ServerGeminiEventType.ToolCallRequest,
value: {
callId: 'test-call-id',
name: 'replace',
args: { file_path: 'test.txt' },
isClientInitiated: false,
prompt_id: 'test-prompt-id',
},
};
yield {
type: ServerGeminiEventType.Error,
value: { error: { message: 'Snapshot failed' } },
};
})();
await result.current.processGeminiStreamEvents(
stream,
Date.now(),
new AbortController().signal,
);

expect(mockCreateFileSnapshot).toHaveBeenCalled();
expect(mockAddItem).toHaveBeenCalledWith(
expect.objectContaining({
type: MessageType.ERROR,
text: expect.stringContaining(
'Failed to create file snapshot: Snapshot failed. Aborting operation.',
),
}),
expect.any(Number),
);
});
});
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

This test correctly verifies that an error message is displayed when file snapshotting fails. However, it's missing a crucial assertion to verify that the model is informed about the tool call failures. The implementation in useGeminiStream.ts calls handleCompletedTools to send the error status back to the model. Please add an assertion to ensure handleCompletedTools (or a mock it depends on) is invoked with the appropriate error information. This will make the test more robust and prevent regressions in this critical error recovery flow.

@vishvananda
Copy link
Copy Markdown
Contributor Author

I added the fix for sending the failed snapshot to the model and included new tests for it. I'm not sure about the critical one which seems to include some hallucinated code, invocation, startTime and endTime are not valid values in the return.

@vishvananda
Copy link
Copy Markdown
Contributor Author

I went ahead and added the tool into the response to the model as requested. I'm still confused why it is asking for invocation, startTime, and endTime, which don't seem to exist AFAICT.

@vishvananda vishvananda force-pushed the fix-checkpoint branch 2 times, most recently from 6a1ba73 to 9fcd148 Compare November 21, 2025 22:07
@vishvananda
Copy link
Copy Markdown
Contributor Author

squashed and rebased to fix merge conflicts

@vishvananda
Copy link
Copy Markdown
Contributor Author

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the checkpointing logic to be synchronous, which is a great improvement to prevent silent failures and race conditions. The changes look solid, and the addition of tests for the new snapshotting logic is appreciated. I've identified one area with significant code duplication and a potential bug in the error handling logic. My review includes a suggestion to refactor this duplicated code for better maintainability and to fix the inconsistent error handling.

Comment on lines +898 to +1001
let commitHash: string | undefined;
try {
if (!gitService) {
throw new Error(
'Checkpointing is enabled but Git service is not available. Ensure Git is installed.',
);
}
commitHash = await gitService.createFileSnapshot(
`Snapshot for ${restorableToolCalls
.map((t) => t.name)
.join(', ')}`,
);
} catch (error) {
const errorMessage = `Failed to create file snapshot: ${getErrorMessage(
error,
)}`;
addItem(
{
type: MessageType.ERROR,
text: `${errorMessage}. Aborting operation.`,
},
Date.now(),
);

// Inform the model that the tool calls failed due to a non-recoverable error.
const toolRegistry = config.getToolRegistry();
const erroredToolCalls: TrackedToolCall[] = toolCallRequests.map(
(request): TrackedCompletedToolCall => {
const tool = toolRegistry.getTool(request.name);
const error = new Error(
'Tool call aborted due to a critical, non-recoverable file snapshot failure.',
);

return {
request,
status: 'error',
response: {
callId: request.callId,
resultDisplay: error.message,
error,
errorType: ToolErrorType.UNKNOWN,
responseParts: [
{
functionResponse: {
name: request.name,
response: { error: error.message },
},
},
],
},
responseSubmittedToGemini: false,
tool,
};
},
);

void handleCompletedToolsRef.current!(erroredToolCalls);
setSnapshottingMessage(null);
return StreamProcessingStatus.Completed; // Stop processing
}
if (!commitHash) {
const errorMessage =
'Failed to get a commit hash for the file snapshot. Aborting operation.';
addItem(
{
type: MessageType.ERROR,
text: errorMessage,
},
Date.now(),
);

// Inform the model that the tool calls failed.
const toolRegistry = config.getToolRegistry();
const erroredToolCalls: TrackedToolCall[] = toolCallRequests.map(
(request): TrackedCompletedToolCall => {
const tool = toolRegistry.getTool(request.name);
const error = new Error(errorMessage);
return {
request,
status: 'error',
response: {
callId: request.callId,
resultDisplay: error.message,
error,
errorType: ToolErrorType.UNKNOWN,
responseParts: [
{
functionResponse: {
name: request.name,
response: { error: error.message },
},
},
],
},
responseSubmittedToGemini: false,
tool,
};
},
);
void handleCompletedToolsRef.current!(erroredToolCalls);
setSnapshottingMessage(null);
cancelOngoingRequest();
return StreamProcessingStatus.Completed;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

There's significant code duplication in the error handling for snapshot creation within the catch block (lines 910-957) and the if (!commitHash) block (lines 958-1001). Both blocks build a list of erroredToolCalls and notify the model. This logic can be extracted into a helper function to improve readability and maintainability.

Additionally, there's an inconsistency in how failures are handled. The if (!commitHash) block calls cancelOngoingRequest(), while the catch block does not. Calling cancelOngoingRequest() will set turnCancelledRef.current to true, which will likely prevent handleCompletedTools from submitting the tool failure responses to Gemini. This seems unintentional, as informing Gemini of the failure allows it to react gracefully. Both failure paths should probably handle cancellation consistently.

I suggest refactoring this into a helper function and removing the cancelOngoingRequest() call to ensure consistent error reporting to the model.

            const handleSnapshotFailure = (
              uiErrorMessage: string,
              modelErrorMessage: string,
            ) => {
              addItem(
                {
                  type: MessageType.ERROR,
                  text: `${uiErrorMessage}. Aborting operation.`,
                },
                Date.now(),
              );

              const toolRegistry = config.getToolRegistry();
              const error = new Error(modelErrorMessage);
              const erroredToolCalls: TrackedToolCall[] = toolCallRequests.map(
                (request): TrackedCompletedToolCall => {
                  const tool = toolRegistry.getTool(request.name);
                  return {
                    request,
                    status: 'error',
                    response: {
                      callId: request.callId,
                      resultDisplay: error.message,
                      error,
                      errorType: ToolErrorType.UNKNOWN,
                      responseParts: [
                        {
                          functionResponse: {
                            name: request.name,
                            response: { error: error.message },
                          },
                        },
                      ],
                    },
                    responseSubmittedToGemini: false,
                    tool,
                  };
                },
              );

              void handleCompletedToolsRef.current!(erroredToolCalls);
              setSnapshottingMessage(null);
            };

            let commitHash: string | undefined;
            try {
              if (!gitService) {
                throw new Error(
                  'Checkpointing is enabled but Git service is not available. Ensure Git is installed.',
                );
              }
              commitHash = await gitService.createFileSnapshot(
                `Snapshot for ${restorableToolCalls
                  .map((t) => t.name)
                  .join(', ')}`,
              );
            } catch (error) {
              const errorMessage = `Failed to create file snapshot: ${getErrorMessage(
                error,
              )}`;
              handleSnapshotFailure(
                errorMessage,
                'Tool call aborted due to a critical, non-recoverable file snapshot failure.',
              );
              return StreamProcessingStatus.Completed; // Stop processing
            }

            if (!commitHash) {
              const errorMessage =
                'Failed to get a commit hash for the file snapshot';
              handleSnapshotFailure(errorMessage, errorMessage);
              return StreamProcessingStatus.Completed;
            }

@vishvananda
Copy link
Copy Markdown
Contributor Author

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request makes checkpointing synchronous and fail-fast, which is a significant improvement for reliability. The core logic change in useGeminiStream.ts correctly moves snapshotting into the stream processing flow, blocking tool execution on failure. The new tests for these failure scenarios are also a valuable addition. I have identified one critical issue regarding type safety when creating synthetic error responses for tool calls, which could lead to runtime errors and should be addressed.

Comment thread packages/cli/src/ui/hooks/useGeminiStream.ts
@vishvananda
Copy link
Copy Markdown
Contributor Author

vishvananda commented Nov 24, 2025

I think gemini is hallucinating in its feedback. Here is an analysis from gemini-cli:
✦ I've investigated the type definitions for TrackedCompletedToolCall, CompletedToolCall, ErroredToolCall, ToolRegistry, and Tool.

Here's my verification of your feedback:

  1. Missing invocation, startTime, and endTime on erroredToolCalls:
    • The erroredToolCalls are constructed as ErroredToolCall within the TrackedCompletedToolCall union.
    • The definition of ErroredToolCall in packages/core/src/core/coreToolScheduler.ts (line 69) is:
   1         export type ErroredToolCall = {
   2           status: 'error';
   3           request: ToolCallRequestInfo;
   4           response: ToolCallResponseInfo;
   5           tool?: AnyDeclarativeTool;
   6           durationMs?: number;
   7           outcome?: ToolConfirmationOutcome;
   8         };
   * As you can see, ErroredToolCall does not include invocation, startTime, or endTime as required properties. Therefore, the object being created for erroredToolCalls correctly conforms to the ErroredToolCall type, and thus to
     TrackedCompletedToolCall.
  1. toolRegistry.getTool(request.name) can return undefined, but the tool property on TrackedCompletedToolCall is not optional.
    • The getTool method in ToolRegistry (from packages/core/src/tools/tool-registry.ts, line 351) has the signature:
      1 getTool(name: string): AnyDeclarativeTool | undefined;
      This confirms it can return undefined.
    • However, as noted above, the tool property on ErroredToolCall (which is part of TrackedCompletedToolCall) is optional (tool?: AnyDeclarativeTool).
    • Therefore, the code correctly handles the possibility of tool being undefined when creating an ErroredToolCall, and this conforms to the type definition.

Conclusion:
Based on the explicit type definitions found in the codebase, the object created for erroredToolCalls does conform to the TrackedCompletedToolCall type (specifically, its ErroredToolCall constituent), and the optionality of the tool
property is correctly handled. The feedback appears to be based on a misunderstanding of these specific type definitions.

@gemini-cli gemini-cli Bot added status/need-issue Pull requests that need to have an associated issue. priority/p2 Important but can be addressed in a future release. area/core Issues related to User Interface, OS Support, Core Functionality and removed status/need-issue Pull requests that need to have an associated issue. labels Jan 7, 2026
@bdmorgan
Copy link
Copy Markdown
Collaborator

Hi @vishvananda, thank you so much for your contribution to Gemini CLI! We really appreciate the time and effort you've put into this.

We're making some updates to our contribution process to improve how we track and review changes. Please take a moment to review our recent discussion post: Improving Our Contribution Process & Introducing New Guidelines.

Key Update: Starting January 26, 2026, the Gemini CLI project will require all pull requests to be associated with an existing issue. Any pull requests not linked to an issue by that date will be automatically closed.

Thank you for your understanding and for being a part of our community!

@vishvananda
Copy link
Copy Markdown
Contributor Author

@bdmorgan thank you for the comment. This PR is already associated with an existing issue: #13331. The PR is out of date at the moment. I haven't made any effort to update it yet because I haven't heard any feedback from the team on whether this is an acceptable approach, or if the team wants to handle it differently.

@bdmorgan
Copy link
Copy Markdown
Collaborator

Thank you for submission to the Gemini CLI project. At this time, we are closing this pull request in order to allow us to better triage and support more recent pull requests against the latest code changes. If you feel like this pull request is a critical contribution to the Gemini CLI project, please associate the pull request with an existing GitHub issue (instructions here: https://docs.github.com/en/issues/tracking-your-work-with-issues/using-issues/linking-a-pull-request-to-an-issue) before reopening. After Monday January 26 2026, any pull requests submitted by contributors without an associated issue will be automatically closed (more information here: #16706). If you do choose to reopen and submit this pull request, please ensure you rebase your changes onto the current main branch before resubmitting. This will help avoid merge conflicts and ensure your contribution is compatible with the latest codebase.

@bdmorgan bdmorgan closed this Jan 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/core Issues related to User Interface, OS Support, Core Functionality priority/p2 Important but can be addressed in a future release.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants