What happened?
We've seen this issue for 2 days now and it is 100% repeatable. It may be due to my persistent efforts to compress my context, but the timing is highly suspect: it may relate to the proposed fix #8026, which seems to have moved this issue from rare to 100% repeatable. The error as displayed:
╭───────────────╮
│ > /compress │
╰───────────────╯
✕ Failed to compress chat history: Unable to submit request because it has a maxOutputTokens value of 167538 but the supported range is from 1
(inclusive) to 65537 (exclusive). Update the value and try again.
I thought this might be a regression, in that I might have had maxOutputTokens set too high in settings.json from before this fix was introduced, but there is no reference to it in that file whatsoever.
It occurs to me as I write this that my typical workflow was to use gemini-2.5-flash for the compress operation once I had exceeded my quota for gemini-2.5-pro calls for the day. I have yet to dig into the code to see what limits are currently placed on the /compress operation in CLI version 0.4.0, but I wonder if they differ between the pro and flash models.
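For illustration only: the error message implies the request's maxOutputTokens (167538) is being derived from something like the context size rather than being clamped to the model's supported range of [1, 65537). A minimal sketch of the kind of clamp that would avoid the rejection (the function name and per-model ceilings here are hypothetical, not the actual gemini-cli code):

```typescript
// Hypothetical per-model output-token ceilings; the error message implies
// the backend accepts values in the range [1, 65537) for this model.
const MAX_OUTPUT_TOKENS: Record<string, number> = {
  "gemini-2.5-pro": 65536,
  "gemini-2.5-flash": 65536,
};

// Clamp a requested maxOutputTokens into the model's supported range
// before sending the compression request.
function clampMaxOutputTokens(model: string, requested: number): number {
  const ceiling = MAX_OUTPUT_TOKENS[model] ?? 65536;
  return Math.min(Math.max(requested, 1), ceiling);
}

// The failing request asked for 167538 tokens; clamping keeps it valid.
console.log(clampMaxOutputTokens("gemini-2.5-pro", 167538)); // 65536
```

If the pro and flash paths compute this value differently, that could explain why switching models changes how often the error appears.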
What did you expect to happen?
I expect to be able to compress my historical context reliably, as this is a critically important operation for long running projects.
Client information
- CLI Version: 0.4.0
- Git Commit: 8921369
- Session ID: 1966ed4f-8268-4206-94a9-23bf4d31deb7
- Operating System: darwin v24.7.0
- Sandbox Environment: no sandbox
- Model Version: gemini-2.5-pro
- Memory Usage: 389.8 MB
- IDE Client: VS Code
Login information
Google account - personal
While I'm thinking about it: I tried to attach paid project credentials to the CLI but was never able to successfully connect them, even though I have the required access to the project. With this error continuing, however, I'm glad I was unable to do so at this point.
Anything else we need to know?
No response