common : remove regex for gguf split and tag parsing #21294

Closed
angt wants to merge 1 commit into ggml-org:master from angt:common-remove-regex-for-gguf-split-and-tag-parsing

angt (Member) commented Apr 2, 2026

Overview

Try to reduce stack usage by removing the regexes used for gguf split and tag parsing.
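For readers who want a picture of what regex-free split parsing can look like, here is a minimal sketch. It is not the code from this PR. It assumes the usual split naming `<prefix>-NNNNN-of-NNNNN.gguf` with five-digit counters, and `split_info`, `parse_5_digits` and `parse_split_name` are hypothetical names made up for illustration.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical result type, not taken from the PR.
struct split_info {
    std::string prefix; // everything before "-NNNNN-of-NNNNN.gguf"
    int index;          // 1-based shard index
    int count;          // total number of shards
};

// Parse exactly five decimal digits; returns -1 on anything else.
static int parse_5_digits(std::string_view s) {
    if (s.size() != 5) return -1;
    int v = 0;
    for (char c : s) {
        if (c < '0' || c > '9') return -1;
        v = v * 10 + (c - '0');
    }
    return v;
}

// Hypothetical helper: recognises "<prefix>-00001-of-00003.gguf"
// using only fixed offsets from the end of the name (no regex, no recursion).
static std::optional<split_info> parse_split_name(std::string_view name) {
    constexpr std::string_view ext = ".gguf";
    constexpr std::string_view of  = "-of-";
    // Fixed-size tail: '-' + 5 digits + "-of-" + 5 digits + ".gguf" = 20 chars.
    constexpr std::size_t tail = 1 + 5 + of.size() + 5 + ext.size();

    if (name.size() <= tail) return std::nullopt; // require a non-empty prefix
    const std::string_view t = name.substr(name.size() - tail);

    if (t[0] != '-')                 return std::nullopt;
    if (t.substr(6, of.size()) != of) return std::nullopt;
    if (t.substr(15) != ext)         return std::nullopt;

    const int index = parse_5_digits(t.substr(1, 5));
    const int count = parse_5_digits(t.substr(10, 5));
    if (index < 1 || count < 1 || index > count) return std::nullopt;

    return split_info{std::string(name.substr(0, name.size() - tail)), index, count};
}
```

Under these assumptions, `parse_split_name("model-Q4_K_M-00002-of-00005.gguf")` yields the prefix `model-Q4_K_M`, index 2 and count 5, while a plain `model.gguf` is rejected. The work is a handful of comparisons on a `std::string_view`, so stack usage stays constant regardless of input length.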

Additional information

Requirements

  • I have read and agree with the contributing guidelines
  • AI usage disclosure: YES for tag extraction, my "C" version was too big...

Signed-off-by: Adrien Gallouët <angt@huggingface.co>
angt requested a review from a team as a code owner on April 2, 2026, 09:13
angt (Member, Author) commented Apr 2, 2026

Before merging #21290, I want to see whether avoiding regex fixes the issue.

angt (Member, Author) commented Apr 2, 2026

Sadly, that's not enough: the URLs are too big 👀

pwilkin (Member) commented Apr 2, 2026

@aldehir, what about your conversion to a PEG parser for those?

aldehir (Contributor) commented Apr 2, 2026

You can write a PEG parser for these patterns. But if the implementation in this PR is still having problems, I'm not sure PEG helps. It would look something like this: aldehir#11 (forgive my AI usage, I'm not at my PC).

There's another regex that matches the checksum; is it not in the crash path? It would blow up the stack because of its repetitions.

angt closed this on Apr 29, 2026