What happened?
Over the last several days, Gemini CLI has started to ignore the settings that select the model. I chose Gemini 3.1 Pro Preview because I want good-quality code and efficient coding sessions. Recently I have observed that Gemini routes requests to the Pro model at the beginning of a session and, after some time, switches to the Flash model. Because the code base I work on is complex and multi-platform, the change of routing is immediately visible as an apparent loss of context: the model does not remember the last user actions or requests, or loses the ability to follow the implementation plan and jumps to editing code on which work was already finished earlier with that same agent. Moreover, just a couple of minutes earlier the agent had confirmed, at the user's request, that the code it is now editing was in fact already finished.
Loss of context and forgetting execution / implementation plans often results in the introduction of bugs and a bug-squashing frenzy. When the agent comes back with a new code modification proposal just several seconds after every quick compilation attempt, it is clear that it is using the Flash model. After checking the stats, I observed that during some sessions almost all requests were routed to Gemini Flash.
At the same time, the usage quota for Gemini Pro is far from being even dented: I have never seen it above 15%, and often, when Gemini Flash is in a frenzy, Pro usage is at 0% (rounding error) while Flash is at 15% of the daily quota. This happens regardless of the time of day or night; I sometimes work very long hours.
What did you expect to happen?
Just stick to the contract Google has entered into with the customer and respect configuration decisions. The attached session UUID may not be representative of the problem, but an earlier session had to be terminated early because of this routing. It also seems that the Flash model has a much lower ability to follow instructions, mandatory rules, or security limitations.
Client information
CLI Version: 0.39.0
Git Commit: https://github.com/google-gemini/gemini-cli/commit/398f78dcaa8fd2396684add19933916f7b87d349
Session ID: 6df14344-1210-4a6d-81cd-4355a0318e6c
Operating System: win32 v25.9.0
Sandbox Environment: no sandbox
Model Version: gemini-3.1-pro-preview
Auth Type: oauth-personal
Memory Usage: 393.5 MB
Terminal Name: Unknown
Terminal Background: #0c0c0c
Kitty Keyboard Protocol: Unsupported
Login information
No response
Anything else we need to know?
No response