
feature:rate-limit, bug:fix caching and error spamming issue #171

Merged

zfoong merged 1 commit into V1.2.2 from feature/model-rate-limit on Apr 3, 2026

Conversation

Collaborator

@zfoong zfoong commented Apr 3, 2026

What and why
Exceeding the rate limit of some LLM providers causes requests to fail.
We introduced a slow mode that caps throughput at 25k tokens per minute (TPM).
I have also fixed the KV caching issue and the error-spamming issue that caused an error to pop up in the chat panel every second.
I also discovered a bug where the Anthropic model returned truncated output because responses exceeded the configured max_token, so I increased max_token to 16k tokens.
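For context, a minimal sketch of what a "slow mode" TPM limiter can look like, assuming a simple sliding-window token budget. The names (SlowModeLimiter, TPM_LIMIT, acquire, estimated tokens) are illustrative and not taken from this PR's code:

```python
import time
from collections import deque

# Hypothetical sketch (not the PR's actual implementation): before each
# request, wait until tokens used in the trailing 60 seconds plus the new
# request fit under a 25k-per-minute budget.
TPM_LIMIT = 25_000

class SlowModeLimiter:
    def __init__(self, tpm_limit: int = TPM_LIMIT) -> None:
        self.tpm_limit = tpm_limit
        self.usage: deque[tuple[float, int]] = deque()  # (timestamp, tokens)

    def _used_last_minute(self, now: float) -> int:
        # Drop entries older than 60 seconds, then sum what remains.
        while self.usage and now - self.usage[0][0] > 60:
            self.usage.popleft()
        return sum(tokens for _, tokens in self.usage)

    def acquire(self, tokens: int) -> None:
        # Block until the request fits into the per-minute token budget.
        if tokens > self.tpm_limit:
            raise ValueError("single request exceeds the TPM budget")
        while True:
            now = time.monotonic()
            if self._used_last_minute(now) + tokens <= self.tpm_limit:
                self.usage.append((now, tokens))
                return
            # Sleep until the oldest entry ages out of the 60-second window.
            time.sleep(max(0.1, 60 - (now - self.usage[0][0])))

# Illustrative usage: estimate the request size, acquire budget, then call
# the provider with a larger output cap (e.g. max_tokens=16_000).
# limiter = SlowModeLimiter()
# limiter.acquire(estimated_prompt_tokens + estimated_output_tokens)
```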

@zfoong zfoong merged commit b9459d9 into V1.2.2 on Apr 3, 2026