
grammar: increase MAX_REPETITION_THRESHOLD + make it configurable via envvar #21003

Open
pwilkin wants to merge 2 commits into ggml-org:master from pwilkin:config-max-repetition-threshold


@pwilkin
Member

pwilkin commented Mar 25, 2026

Overview

For very large tool-calling environments (like OpenClaw), the current limit is insufficient. Even a higher limit might not be enough, so on top of raising it I'm making it configurable via an environment variable.
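
A minimal sketch of what an environment-variable override for a repetition limit could look like. It is not the code from this PR: the variable name `EXAMPLE_GRAMMAR_MAX_REPETITIONS`, the helper names, and the error message are hypothetical, and the default of 2000 is only assumed from the "2k repetitions" figure mentioned in the discussion.

```cpp
#include <cctype>
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Assumed default, based on the "2k repetitions" figure in the discussion.
static constexpr uint64_t DEFAULT_MAX_REPETITION_THRESHOLD = 2000;

// Hypothetical variable name, chosen for illustration only.
static constexpr const char * ENV_MAX_REPETITION_THRESHOLD = "EXAMPLE_GRAMMAR_MAX_REPETITIONS";

// Returns the override from the environment, or the default when the
// variable is unset, empty, not a plain positive decimal number, or out of range.
static uint64_t max_repetition_threshold() {
    const char * raw = std::getenv(ENV_MAX_REPETITION_THRESHOLD);
    if (raw == nullptr || !std::isdigit(static_cast<unsigned char>(raw[0]))) {
        return DEFAULT_MAX_REPETITION_THRESHOLD;
    }
    errno = 0;
    char * end = nullptr;
    const unsigned long long value = std::strtoull(raw, &end, 10);
    if (errno != 0 || *end != '\0' || value == 0) {
        return DEFAULT_MAX_REPETITION_THRESHOLD;
    }
    return static_cast<uint64_t>(value);
}

// Rejects a bounded repetition such as item{min,max} whose bounds exceed the limit.
static void check_repetition(uint64_t min_times, uint64_t max_times) {
    const uint64_t limit = max_repetition_threshold();
    if (min_times > limit || max_times > limit) {
        throw std::runtime_error(
            "number of repetitions exceeds limit of " + std::to_string(limit));
    }
}
```

Falling back to the default on malformed input keeps a typo in the environment from silently disabling the safeguard; a stricter variant could report the error instead.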

Additional information

Together with #20961, this should help with #20879.

Requirements

  • I have read and agree with the contributing guidelines
  • AI usage disclosure: YES, told Claude to add the envvar config

@pwilkin pwilkin requested a review from ggerganov as a code owner March 25, 2026 18:08
@pwilkin pwilkin requested review from aldehir, ggerganov and ngxson and removed request for ggerganov and ngxson March 25, 2026 18:08
@pwilkin
Member Author

pwilkin commented Mar 31, 2026

@CISC @ngxson or @ggerganov maybe care to help? Need 1 more approval :)

@pwilkin
Member Author

pwilkin commented Mar 31, 2026

Fixes #20867

@ggerganov
Member

Should we wait to see if #21216 fixes the issue? AFAIU, if it works, we won't have to adjust the threshold.

@pwilkin
Member Author

pwilkin commented Mar 31, 2026

> Should we wait to see if #21216 fixes the issue? AFAIU, if it works, we won't have to adjust the threshold.

No, people have requested the restriction be modifiable even before the explosion of OpenClaw models because they have some custom grammars that require lots of repetitions.

@aldehir
Contributor

aldehir commented Apr 1, 2026

I think it's important we understand why it's exploding in the first place. Then we can make an informed decision.

Anyway, I fixed it in #21216. Need to refine the grammar a bit more, it's causing weird generations on tinyllama-function-call masquerading as Qwen3-Coder.

@pwilkin
Member Author

pwilkin commented Apr 1, 2026

> I think it's important we understand why it's exploding in the first place. Then we can make an informed decision.
>
> Anyway, I fixed it in #21216. Need to refine the grammar a bit more, it's causing weird generations on tinyllama-function-call masquerading as Qwen3-Coder.

For the exploding stuff, yes, but people have called for this to be configurable way before the exploding stuff happened; I just didn't get to it. Some people have grammars that legitimately need more than 2k repetitions.


Labels

testing Everything test related


3 participants