fix(cli): prevent race condition in loop detection retry#17916
fix(cli): prevent race condition in loop detection retry#17916cynthialong0-0 merged 4 commits intogoogle-gemini:mainfrom
Conversation
Summary of ChangesHello @skyvanguard, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses a critical race condition within the CLI's loop detection retry mechanism. By introducing a synchronous reference for the responding state and ensuring that retry queries are properly awaited, it prevents multiple query submissions from occurring simultaneously, which could lead to unexpected behavior when a user interacts with the loop detection dialog. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request aims to fix a race condition in the loop detection retry logic by introducing a synchronous useRef to track the responding state. However, the current implementation is incomplete and introduces two critical deadlock vulnerabilities. The first, in submitQuery, arises from activeQueryIdRef.current being updated before an early return, which can prevent the 'responding' state from being cleared. The second issue is that isRespondingRef and isResponding state can get out of sync, leading to a permanently blocked state. To resolve these, ensure activeQueryIdRef is updated only for queries that proceed to execution, and that isRespondingRef and isResponding state are always updated together using a consistent helper function.
| // Synchronous ref to prevent race conditions when checking responding state | ||
| const isRespondingRef = useRef<boolean>(false); |
There was a problem hiding this comment.
While isRespondingRef is a good approach to solve the race condition, its state can get out of sync with the isResponding state because setIsResponding is called in multiple places throughout this hook without also updating isRespondingRef. This can lead to a permanently blocked state where new queries are always rejected.
For example, in cancelOngoingRequest:
if (isFullCancellation) {
if (!activeShellPtyId) {
// ...
setIsResponding(false); // isRespondingRef is not updated
}
}This leaves isRespondingRef.current as true while isResponding becomes false, blocking all future queries.
To fix this critical issue, I recommend creating a single function to update both the state and the ref, and using it consistently.
-
Define a helper function to keep state and ref in sync (you can add this right after the
isRespondingRefdeclaration):const setResponding = useCallback((responding: boolean) => { isRespondingRef.current = responding; setIsResponding(responding); }, []);
-
Replace all calls to
setIsResponding(value)withsetResponding(value). This includes the ones you've already updated in this PR insidesubmitQuery, as well as the ones in:onExecuseEffectforactiveShellPtyIdcancelOngoingRequesthandleUserCancelledEventhandleAgentExecutionStoppedEventhandleCompletedTools
This will ensure the state remains consistent and prevent the deadlock.
a76b601 to
fde27df
Compare
|
Thanks for the detailed review! I've addressed the feedback:
|
956ff5a to
808ca09
Compare
| const [isResponding, setIsResponding] = useState<boolean>(false); | ||
| const [thought, thoughtRef, setThought] = | ||
| useStateAndRef<ThoughtSummary | null>(null); | ||
| const [isResponding, setIsRespondingState] = useState<boolean>(false); |
There was a problem hiding this comment.
Consider using useStateAndRef instead.
cynthialong0-0
left a comment
There was a problem hiding this comment.
Overall looks good. Can you resolve the conflicts and add unit test for your change?
808ca09 to
c5cf875
Compare
|
Hi @skyvanguard the CLA test is failing because the commit was co-authored by Claude, also I think there are conflicts still present |
c5cf875 to
bb8ff5d
Compare
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request effectively addresses a race condition in the loop detection retry mechanism by introducing a synchronous ref (isRespondingRef) to track the responding state, preventing concurrent submitQuery calls. The change to await the retry query is also a crucial fix. The added tests are comprehensive and validate the new logic. I have one suggestion to improve code consistency by using the existing useStateAndRef hook, which also aligns with the principle of using explicit state variables for asynchronous operations.
Note: Security Review did not run due to the size of the PR.
| const [isResponding, setIsRespondingState] = useState<boolean>(false); | ||
| const isRespondingRef = useRef<boolean>(false); | ||
| const setIsResponding = useCallback( | ||
| (value: boolean) => { | ||
| setIsRespondingState(value); | ||
| isRespondingRef.current = value; | ||
| }, | ||
| [setIsRespondingState], | ||
| ); |
There was a problem hiding this comment.
To improve consistency with existing patterns in the codebase, you can use the useStateAndRef hook here. This hook is already used in this file for thought, pendingHistoryItem, etc., and it encapsulates the logic of keeping a state variable and a ref in sync. This will also make the code more concise.
| const [isResponding, setIsRespondingState] = useState<boolean>(false); | |
| const isRespondingRef = useRef<boolean>(false); | |
| const setIsResponding = useCallback( | |
| (value: boolean) => { | |
| setIsRespondingState(value); | |
| isRespondingRef.current = value; | |
| }, | |
| [setIsRespondingState], | |
| ); | |
| const [isResponding, isRespondingRef, setIsResponding] = | |
| useStateAndRef<boolean>(false); |
References
- Line 72 of the repository style guide states: 'Coding Style: Adhere to existing patterns in packages/cli (React/Ink) and packages/core (Backend logic).' The
useStateAndRefhook is an existing pattern in this file for managing state and a corresponding ref simultaneously. (link) - When managing the state of asynchronous operations, rely on an explicit state variable (e.g., a state enum) rather than checking for the existence of a promise object. Promise objects may be cleared in
finallyblocks upon completion, making them unreliable for state checks after the operation has finished. UsinguseStateAndRefforisRespondingaligns with this by providing an explicit and reliable state variable.
Add synchronous isRespondingRef guard to submitQuery to prevent concurrent submissions that bypass React's async state updates. - Add isRespondingRef (useRef) synced with isResponding (useState) via a useCallback wrapper for setIsResponding - Add isRespondingRef.current check in submitQuery's early-return guard - Move activeQueryIdRef assignment after the guard to avoid stale IDs - Make loop detection onComplete async and await submitQuery - Add setIsResponding to useCallback/useEffect dependency arrays - Add 2 unit tests for race condition prevention
bb8ff5d to
e8f6018
Compare
|
@cynthialong0-0 Thanks for the review! Regarding the suggestion to use I investigated this and it would be the ideal approach for consistency. However, the existing Fixing the mock to use real React hooks causes heap overflow in the test environment. So the explicit Happy to refactor to Also: conflicts resolved (rebased on main) and Co-Authored-By removed for CLA compliance. |
…ni#17916) Co-authored-by: cynthialong0-0 <82900738+cynthialong0-0@users.noreply.github.com>
…ni#17916) Co-authored-by: cynthialong0-0 <82900738+cynthialong0-0@users.noreply.github.com>
…ni#17916) Co-authored-by: cynthialong0-0 <82900738+cynthialong0-0@users.noreply.github.com>
…ni#17916) Co-authored-by: cynthialong0-0 <82900738+cynthialong0-0@users.noreply.github.com>
…ni#17916) Co-authored-by: cynthialong0-0 <82900738+cynthialong0-0@users.noreply.github.com>
Summary
Fixes #17071
Prevents a race condition that can occur when the user clicks "disable" in the loop detection dialog. The retry
submitQuerycould race with user prompts due to React state batching - checkingstreamingStateisn't synchronous enough.The Problem
When loop detection triggers and shows the confirmation dialog:
setIsResponding(false)is called in thefinallyblockstreamingStatebecomesIdle(after React batch)InputPromptbecomes active - user can type new promptssubmitQuery()is called without awaitsubmitQuery()calls can run in parallelThe Solution
isRespondingReffor synchronous state checkingsubmitQuerybefore allowing new queriesChanges
Test plan
useGeminiStream.test.tsxpassAppContainer.test.tsxpass